PREFACE


Combinatorics played an important role in the development of probability theory 
and the two have continued to be closely related. Now probability theory, by 
offering new approaches to problems of discrete mathematics, is beginning to 
repay its debt to combinatorics. Among these new approaches, the methods of 
asymptotic analysis, which have been well developed in probability theory, can be 
used to solve certain complicated combinatorial problems. 

If the uniform distribution is defined on the set of combinatorial structures in 
question, then the numerical characteristics of the structures can be regarded as 
random variables and analyzed by probabilistic methods. By using the probabilistic 
approach, we restrict our attention to “typical” structures that constitute the bulk 
of the set, excluding the small fraction with exceptional properties. 

The probabilistic approach that is now widely used in combinatorics was first 
formulated by V. L. Goncharov, who applied it to $S_n$, the set of all permutations
of degree $n$, and to the runs in random (0,1)-sequences. S. N. Bernstein,
N. V. Smirnov, and V. E. Stepanov were among those who developed probabilis- 
tic combinatorics in Russia, building on the famous Russian school of probability 
founded by A. A. Markov, P. L. Lyapunov, A. Ya. Khinchin, and A. N. Kolmogorov. 

This book is based on results obtained primarily by Russian mathematicians and 
presents results on random graphs, systems of random linear equations in GF(2), 
random permutations, and some simple equations involving permutations. 

Selecting material for the book was a difficult job. Of course, this book is not 
a complete treatment of the topics mentioned. Some results (and their proofs) did 
not seem ready for inclusion in a book, and there may be relevant results that have 
escaped the author’s attention. 

There is a large body of literature on random graphs, and it is not possible to
review it here. Among the probabilistic tools that have been used to analyze random
structures are the method of moments, Poisson and Gaussian approximations,
generating functions using the saddle-point method, Tauberian-type theorems, analysis
of singularities, and martingale theory. In the past two decades, a method called
the generalized scheme of allocation has been widely used in probabilistic
combinatorics. It is so named because of its connection with the problem of assigning
$n$ objects randomly to $N$ cells. Let $\eta_1, \dots, \eta_N$ be random variables that are, for
example, the sizes of components of a graph. If there are independent random
variables $\xi_1, \dots, \xi_N$ so that the joint distribution of $\eta_1, \dots, \eta_N$ for any integers
$k_1, \dots, k_N$ can be written as

$$P\{\eta_1 = k_1, \dots, \eta_N = k_N\} = P\{\xi_1 = k_1, \dots, \xi_N = k_N \mid \xi_1 + \cdots + \xi_N = n\},$$

where $n$ is a positive integer, then we say that $\eta_1, \dots, \eta_N$ satisfy the generalized
scheme of allocation with parameters $n$ and $N$ and independent random variables
$\xi_1, \dots, \xi_N$.

Graph evolution is the random process of sequentially adding new edges to a 
graph. For many classes of random graphs with $n$ labeled vertices and $T$ edges, the
parameter $\theta = 2T/n$ plays the role of time in the process; various graph properties
often change abruptly at the critical point $\theta = 1$. Graph evolution is the most
fascinating object in the theory of random graphs, and it appears that it is well 
suited to the generalized scheme. We will show that applying generalized schemes 
makes it possible to analyze random graphs at different stages of their evolution 
and to obtain limit distributions in those cases in which only properties similar to 
the law of large numbers have been proved. 

The theory of random equations in finite fields is shared by probability, combi- 
natorics, and algebra. In this book, we will consider systems of linear equations in 
GF(2) with random coefficients. The matrix of such a system corresponds to a ran- 
dom graph or hypergraph; therefore, results on random graphs help to study these 
systems. We are sure that this application alone justifies developing the theory of 
random graphs. 

The theory of random permutations is a well-developed branch of probabilis- 
tic combinatorics. Although Goncharov has investigated the cycle structure of a 
random permutation in great detail, there is still great interest in this area. We will 
fully describe the asymptotic behavior of $P\{\nu_n = k\}$ for the total number $\nu_n$ of
cycles in a random permutation for all possible behaviors of the parameters $n$ and
$k = k(n)$ as $n \to \infty$. We will also give some of the asymptotic results for the
number of solutions of the equation $X^d = e$, where the unknown $X \in S_n$, $d$ is a
fixed positive integer, and $e$ is the identity of the group $S_n$.

Although the generalized scheme of allocation cannot be applied to nonequi- 
probable graphs, we present some results in this situation by using the method 
of moments. The statistical applications of nonequiprobable graphs call for the 
development of regular methods of analyzing these structures. 

The book consists of five chapters. Chapter 1 describes the generalized scheme 
of allocation and its applications to a random forest of nonrooted trees, a random
graph consisting of unicyclic components, and a random graph with a mixture of
trees and unicyclic components. In Chapter 2, these results are applied to the study 
of the evolution of random graphs. Chapter 3 is devoted to systems of random linear 
equations in GF(2). Much of this branch of probabilistic combinatorics is the work 
of Russian mathematicians; this is the first English-language presentation of many 
of the results. Random permutations are considered in Chapter 4, and Chapter 5 
contains some results on permutation equations of the form $X^d = e$.

Most results presented in this book derive from work done over the past fifteen 
years; notes and references can be found in the last section of each chapter. (It is, 
of course, impossible to give a complete list in each particular area.) In addition to 
articles used in the text, the summary sections of all chapters include references to 
papers on related topics, especially those in which the same results were obtained 
by other methods. 

We assume that the reader is familiar with basic combinatorics. This book 
should be accessible to those who have completed standard courses of mathemat- 
ical analysis and probability theory. Section 1.1 includes a list of pertinent results 
from probability. 

This book continues in the tradition of Random Mappings [78] and differs from 
other treatments of random graphs in the systematic use of the generalized scheme 
of allocation. We hope that the chapter on systems of random linear equations in 
GF(2) will be of interest to a broad audience. I wish to express my sincere appre- 
ciation to G.-C. Rota, who encouraged me to write this book for the Encyclopedia 
of Mathematics series, even though there are already several excellent books on 
random graphs. 

My greatest concern is writing the book in English. I am indebted to the editors 
who have brought the text to an acceptable form. It is apparent that no amount of 
editing can erase the heavy Russian accent of my written English, so my special 
thanks go to those readers who will not be deterred by the language of the book. 

I greatly appreciate the support I received from my colleagues at the Steklov 
Mathematical Institute while I wrote this book. 

The generalized scheme of allocation
and the components of 
random graphs 


1.1. The probabilistic approach to enumerative 
combinatorial problems 


The solution to enumerative combinatorial problems consists in finding an exact 
or approximate expression for the number of combinatorial objects possessing the 
property under investigation. In this book, the probabilistic approach to enumera- 
tive combinatorial problems is adopted. 

The fundamental notion of probability theory is the probability space $(\Omega, \mathcal{A}, P)$,
where $\Omega$ is a set of arbitrary elements, $\mathcal{A}$ is a set of subsets of $\Omega$ forming a
$\sigma$-algebra of events with the operations of union and intersection of sets, and $P$ is
a nonnegative countably additive function defined for each event $A \in \mathcal{A}$ so that
$P(\Omega) = 1$. The set $\Omega$ is called the space of elementary events and $P$ is a probability.
A random variable is a real-valued measurable function $\xi = \xi(\omega)$ defined for all
$\omega \in \Omega$.

Suppose $\Omega$ consists of finitely many elements. Then the probability $P$ is defined
on all subsets of $\Omega$ if it is defined for each elementary event $\omega \in \Omega$. In this case, any
real-valued function $\xi = \xi(\omega)$ on such a space of elementary events is a random
variable.

Instead of a real-valued function, one may consider a function $f(\omega)$ taking
values from some set $Y$ of arbitrary elements. Such a function $f(\omega)$ may be
considered a generalization of a random variable and is called a random element of
the set $Y$.

In studying combinatorial objects, we consider probability spaces that have a
natural combinatorial interpretation: For the space of elementary events $\Omega$, we take
the set of combinatorial objects under investigation and assign the same probability
to all the elements of the set. In this case, numerical characteristics of combinatorial
objects of $\Omega$ become random variables. The term “random element of the set $\Omega$”
is usually used for the identity function $f(\omega) = \omega$, $\omega \in \Omega$, mapping each element
of the set of combinatorial objects into itself. Since the uniform distribution is
assumed on $\Omega$, the probability that the identity function $f$ takes any fixed value
$\omega$ is the same for all $\omega \in \Omega$. Hence the notion of a random combinatorial object
of $\Omega$, such as the identity function $f(\omega) = \omega$, agrees with the usual notion of a
random element of a set as an element sampled from all elements of the set with
equal probabilities.
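
The natural construction can be made concrete on a small example. The following Python sketch is an illustration only and is not taken from the text: it assumes that the space of elementary events is the set of all permutations of degree n with the uniform distribution, and it tabulates the distribution of one numerical characteristic, the number of cycles. The function names are hypothetical, and the brute-force enumeration is feasible only for small n.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple: i -> perm[i]."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:  # walk along the cycle containing `start`
                seen[j] = True
                j = perm[j]
    return cycles


def cycle_count_distribution(n):
    """Exact law of the number of cycles under the uniform distribution."""
    omega = list(permutations(range(n)))  # the space of elementary events
    counts = Counter(cycle_count(w) for w in omega)
    return {k: Fraction(c, len(omega)) for k, c in sorted(counts.items())}


if __name__ == "__main__":
    for k, p in cycle_count_distribution(4).items():
        print(k, p)
```

For n = 4 the 24 permutations split into 6, 11, 6, and 1 permutations with 1, 2, 3, and 4 cycles, so the probabilities are 1/4, 11/24, 1/4, and 1/24.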

Note that a random combinatorial object with the same distribution could also 
be defined on larger probability spaces. For our purposes, however, the natural 
construction presented here is sufficient for the most part. The exceptions are 
those few cases that involve several independent random combinatorial objects 
and in which it would be necessary to resort to a richer probability space, such as 
the direct product of the natural probability spaces. 

Since we use probability spaces with uniform distributions, in spite of the proba- 
bilistic terminology, the problems considered are in essence enumeration problems 
of combinatorial analysis. The probabilistic approach furnishes a convenient form 
of representation and helps us effectively use the methods of asymptotic analysis 
that have been well developed in the theory of probability. 

Thus, in the probabilistic approach, numerical characteristics of a random
combinatorial object are random variables. The main characteristic of a random variable
$\xi$ is its distribution function $F(x)$ defined for any real $x$ as the probability of the
event $\{\xi < x\}$, that is,

$$F(x) = P\{\xi < x\}.$$


The distribution function $F(x)$ defines a probability distribution on the real line
called the distribution of the random variable $\xi$. With respect to this distribution,
given a function $g(x)$, the Lebesgue-Stieltjes integral

$$\int_{-\infty}^{\infty} g(x)\,dF(x)$$


can be defined. The probabilistic approach has advantages in the asymptotic
investigations of combinatorial problems. As a rule, we have a sequence of random
variables $\xi_n$, $n = 1, 2, \dots$, each of which describes a characteristic of the random
combinatorial object under consideration, and we are interested in the asymptotic
behavior of the distribution functions $F_n(x) = P\{\xi_n < x\}$ as $n \to \infty$.

A sequence of distributions with distribution functions $F_n(x)$ converges weakly
to a distribution with the distribution function $F(x)$ if, for any bounded continuous
function $g(x)$,

$$\int_{-\infty}^{\infty} g(x)\,dF_n(x) \to \int_{-\infty}^{\infty} g(x)\,dF(x)$$

as $n \to \infty$.
The weak convergence of distributions is directly connected with the pointwise
convergence of the distribution functions as follows.

Theorem 1.1.1. A sequence of distribution functions $F_n(x)$ converges to a
distribution function $F(x)$ at all continuity points if and only if the corresponding
sequence of distributions converges weakly to the distribution with distribution
function $F(x)$.


In a sense, the distribution, or the distribution function $F(x)$, characterizes the
random variable $\xi$. The moments of $\xi$ are simple characteristics. If

$$\int_{-\infty}^{\infty} |x|\,dF(x)$$

exists, then

$$E\xi = \int_{-\infty}^{\infty} x\,dF(x)$$

is called the mathematical expectation, or mean, of the random variable $\xi$. Further,

$$m_r = E\xi^r = \int_{-\infty}^{\infty} x^r\,dF(x)$$

is called the $r$th moment, or the moment of $r$th order (if the integral of $|x|^r$ exists).

In probabilistic combinatorics, one usually considers nonnegative integer-valued
random variables. For such a random variable, the factorial moments are
natural characteristics. We denote the $r$th factorial moment by

$$m_{(r)} = E\xi(\xi - 1)\cdots(\xi - r + 1).$$


If a distribution function $F(x)$ can be represented in the form

$$F(x) = \int_{-\infty}^{x} p(u)\,du,$$

where $p(u) \ge 0$, then we say that the distribution has a density $p(u)$. In addition to
the distribution function, it is convenient to represent the distribution of an
integer-valued random variable $\xi$ by the probabilities of its individual values. For $\xi$, we
will use the notation

$$p_k = P\{\xi = k\}, \quad k = 0, 1, \dots,$$

and for integer-valued nonnegative random variables $\xi_n$,

$$p_k^{(n)} = P\{\xi_n = k\}, \quad k = 0, 1, \dots.$$

It is clear that

$$E\xi = \sum_{k=0}^{\infty} k p_k,$$

if this series converges.
It is not difficult to see that the following assertion is true.

Theorem 1.1.2. A sequence of distributions $\{p_k^{(n)}\}$, $n = 1, 2, \dots$, converges
weakly to a distribution $\{p_k\}$ if and only if for every fixed $k = 1, 2, \dots$,

$$p_k^{(n)} \to p_k$$

as $n \to \infty$.


If an estimate of the probability $P\{\xi > 0\}$ is needed for a nonnegative
integer-valued random variable $\xi$, then the simple inequality

$$P\{\xi > 0\} = \sum_{k=1}^{\infty} P\{\xi = k\} \le \sum_{k=1}^{\infty} k P\{\xi = k\} = E\xi \tag{1.1.1}$$

can be useful. In particular, for a sequence $\xi_n$, $n = 1, 2, \dots$, of such random
variables with $E\xi_n \to 0$ as $n \to \infty$, it follows that

$$P\{\xi_n > 0\} \to 0.$$


Since it is generally easier to calculate the moments of a random variable than
the whole distribution, one wants a criterion for the convergence of a sequence of
distributions based on the corresponding moments. But, first, it should be noted
that even if a random variable $\xi$ has moments of all orders, its distribution cannot, in
general, be reconstructed on the basis of these moments, since there exist distinct
distributions that have the same sequences of moments. For example, it is not
difficult to confirm that for any $n = 0, 1, 2, \dots$,

$$\int_0^{\infty} x^n e^{-x^{1/4}} \sin x^{1/4}\,dx = 0.$$

Hence, for $-1 < \alpha < 1$, the function

$$p_\alpha(x) = \frac{1}{24} e^{-x^{1/4}} \left(1 + \alpha \sin x^{1/4}\right)$$

is the density of a distribution on $[0, \infty)$ whose moments do not depend on $\alpha$.
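
The vanishing of these integrals can be checked directly. The following derivation is a sketch added for convenience; it uses the substitution $x = t^4$ (the variable $t$ is introduced here only for this computation), the formula $\int_0^\infty t^m e^{-st}\,dt = m!/s^{m+1}$ for $\operatorname{Re} s > 0$, and the identity $(1 - i)^4 = -4$.

```latex
\begin{align*}
\int_0^\infty x^n e^{-x^{1/4}} \sin x^{1/4}\,dx
  &= 4\int_0^\infty t^{4n+3} e^{-t} \sin t\,dt
   = 4\,\operatorname{Im}\int_0^\infty t^{4n+3} e^{-(1-i)t}\,dt \\
  &= 4\,\operatorname{Im}\frac{(4n+3)!}{(1-i)^{4n+4}}
   = 4\,\operatorname{Im}\frac{(4n+3)!}{(-4)^{n+1}} = 0 .
\end{align*}
```

The same substitution gives $\int_0^\infty e^{-x^{1/4}}\,dx = 4 \cdot 3! = 24$, which accounts for the normalizing constant $1/24$ of the density.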
Thus the distribution functions with moments of all orders are divided into two 
classes: The first class contains the functions that may be uniquely reconstructed 
from their moments, and the second class contains the functions that cannot be 
reconstructed from their moments. There are several sufficient conditions for the 
moment problem to have a unique solution. Let 


$$M_n = \int_{-\infty}^{\infty} |x|^n\,dF(x).$$

A distribution function $F(x)$ is uniquely reconstructed by the sequence $m_r$, $r = 1, 2, \dots$,
of its moments if there exists $A$ such that

$$\frac{1}{n} M_n^{1/n} \le A. \tag{1.1.2}$$

The following theorem describing the so-called method of moments is applicable
only to the first class of distribution functions.


Theorem 1.1.3. If distribution functions $F_n(x)$, $n = 1, 2, \dots$, have the moments
of all orders and for any fixed $r = 1, 2, \dots$,

$$m_r^{(n)} = \int_{-\infty}^{\infty} x^r\,dF_n(x) \to m_r, \quad |m_r| < \infty,$$

as $n \to \infty$, then there exists a distribution function $F(x)$ such that for any fixed
$r = 1, 2, \dots$,

$$m_r = \int_{-\infty}^{\infty} x^r\,dF(x),$$

and from the sequence $F_n(x)$, $n = 1, 2, \dots$, it is possible to select a subsequence
$F_{n_k}(x)$, $k = 1, 2, \dots$, that converges to $F(x)$ as $k \to \infty$ at every continuity point
of $F(x)$.
If the sequence $m_r$, $r = 1, 2, \dots$, uniquely determines the distribution function
$F(x)$, then $F_n(x) \to F(x)$ as $n \to \infty$ at every continuity point of $F(x)$.


Note that the normal (Gaussian) and Poisson distributions are uniquely recon- 
structible by their moments. 

To use the method of moments, it is necessary to calculate moments of random 
variables. One useful method of calculating moments of integer-valued random 
variables is to represent them as sums of random variables that take only the values 
0 and 1. 


Theorem 1.1.4. If

$$S_n = \xi_1 + \cdots + \xi_n,$$

and the random variables $\xi_1, \dots, \xi_n$ take only the values 0 and 1, then for any
$m = 1, 2, \dots, n$,

$$S_n(S_n - 1)\cdots(S_n - m + 1) = \sum_{\{i_1, \dots, i_m\}} \xi_{i_1} \cdots \xi_{i_m},$$

where the summation is taken over all different ordered sets of different indices
$\{i_1, \dots, i_m\}$, the number of which is equal to $\binom{n}{m} m!$.
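
The identity of Theorem 1.1.4 is purely algebraic, so it can be checked by brute force for small n. The sketch below is an illustration under that assumption, with hypothetical function names: it runs over every vector of zeros and ones of length n and compares the falling factorial of the sum with the sum of products over ordered sets of distinct indices.

```python
from itertools import permutations, product
from math import comb, factorial


def falling_factorial(s, m):
    """s (s - 1) ... (s - m + 1)."""
    result = 1
    for j in range(m):
        result *= s - j
    return result


def sum_over_ordered_index_sets(xs, m):
    """Sum of xs[i1] * ... * xs[im] over ordered m-tuples of distinct indices."""
    total = 0
    for idx in permutations(range(len(xs)), m):
        term = 1
        for i in idx:
            term *= xs[i]
        total += term
    return total


def number_of_ordered_index_sets(n, m):
    """The count stated in the theorem: binomial(n, m) * m!."""
    return comb(n, m) * factorial(m)


def check_identity(n):
    """True if the identity holds for every 0-1 vector of length n and m <= n."""
    for xs in product((0, 1), repeat=n):
        for m in range(1, n + 1):
            if falling_factorial(sum(xs), m) != sum_over_ordered_index_sets(xs, m):
                return False
    return True


if __name__ == "__main__":
    print(check_identity(5))
```

Taking expectations on both sides of the identity expresses the mth factorial moment of the sum through the probabilities that m given indicators equal 1 simultaneously, which is how the theorem is typically used.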


Generating functions also provide a useful tool for solving many problems
related to distributions of nonnegative integer-valued random variables. The
complex-valued function

$$\phi(z) = \phi_\xi(z) = \sum_{k=0}^{\infty} p_k z^k = E z^{\xi} \tag{1.1.3}$$

is called the generating function of the distribution of the random variable $\xi$. It
is defined at least for $|z| \le 1$. For example, for the Poisson distribution with
parameter $\lambda$, which is defined by the probabilities

$$p_k = \frac{\lambda^k}{k!} e^{-\lambda}, \quad k = 0, 1, \dots,$$

the generating function is $e^{\lambda(z-1)}$.

Relation (1.1.3) determines a one-to-one correspondence between the generating
functions and the distributions of nonnegative integer-valued random variables,
since the distribution can be reconstructed by using the formula

$$p_k = \frac{1}{k!} \phi^{(k)}(0), \quad k = 0, 1, \dots. \tag{1.1.4}$$


Generating functions are especially convenient for the investigation of sums of
independent random variables. If $\xi_1, \dots, \xi_n$ are independent nonnegative
integer-valued random variables and $S_n = \xi_1 + \cdots + \xi_n$, then

$$\phi_{S_n}(z) = \phi_{\xi_1}(z) \cdots \phi_{\xi_n}(z).$$
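
The product rule for generating functions is the counterpart of the convolution of distributions. The following Python sketch is illustrative only (the function names and the truncation level are assumptions made here): it compares the generating function of a convolution with the product of the generating functions and evaluates the generating function of a truncated Poisson distribution.

```python
from math import exp, factorial


def poisson_pmf(lam, kmax):
    """Probabilities p_0, ..., p_kmax of the Poisson law with parameter lam."""
    return [lam ** k * exp(-lam) / factorial(k) for k in range(kmax + 1)]


def pgf(probs, z):
    """Value of sum_k p_k z^k for a finitely supported or truncated sequence."""
    return sum(p * z ** k for k, p in enumerate(probs))


def convolve(a, b):
    """Law of the sum of two independent variables with laws a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


if __name__ == "__main__":
    die = [0.0] + [1 / 6] * 6          # uniform law on {1, ..., 6}
    z = 0.7
    print(pgf(convolve(die, die), z), pgf(die, z) ** 2)
    # Truncated Poisson series against the closed form exp(lam * (z - 1)).
    print(pgf(poisson_pmf(2.0, 60), 0.5), exp(2.0 * (0.5 - 1.0)))
```

Truncating the Poisson series at 60 terms is an arbitrary choice that makes the neglected tail negligible for the parameter used here.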


The correspondence between the generating functions and the distributions is con- 
tinuous in the following sense. 


Theorem 1.1.5. Let $\{p_k^{(n)}\}$, $n = 1, 2, \dots$, be a sequence of distributions. If for
any $k = 0, 1, \dots$,

$$p_k^{(n)} \to p_k$$

as $n \to \infty$, then the sequence of corresponding generating functions $\phi_n(z)$, $n =
1, 2, \dots$, converges to the generating function of the sequence $\{p_k\}$ uniformly in
any circle $|z| \le r < 1$.

In particular, if $\{p_k\}$ is a distribution, then the sequence of corresponding
generating functions converges to the generating function $\phi(z)$ of the distribution $\{p_k\}$
uniformly in any circle $|z| \le r < 1$.


Theorem 1.1.6. If the sequence of generating functions $\phi_n(z)$, $n = 1, 2, \dots$, of
the distributions $\{p_k^{(n)}\}$ converges to a generating function $\phi(z)$ of a distribution
$\{p_k\}$ on a set $M$ that has a limit point inside of the circle $|z| < 1$, then the
distributions $\{p_k^{(n)}\}$ converge weakly to the distribution $\{p_k\}$.


Since a generating function $\phi(z) = \sum_{k=0}^{\infty} p_k z^k$ is analytic, its coefficients can
be represented by the Cauchy formula

$$p_n = \frac{1}{2\pi i} \oint_C \frac{\phi(z)\,dz}{z^{n+1}}, \quad n = 0, 1, \dots,$$

where the integral is over a contour $C$ that lies inside the domain of analyticity of
$\phi(z)$ and encloses the point $z = 0$.
Thus, if we are interested in the behavior of $p_n$ as $n \to \infty$, then we have to be
able to estimate contour integrals of the form

$$G(\lambda) = \frac{1}{2\pi i} \int_C g(z) e^{\lambda f(z)}\,dz,$$

where $g(z)$ and $f(z)$ are analytic in the neighborhood of the curve of integration
$C$ and $\lambda$ is a real parameter tending to infinity.

The saddle-point method is used to estimate such integrals. The contour of
integration $C$ may be chosen in different ways. The saddle-point method requires
choosing the contour $C$ in such a way that it passes through the point $z_0$, which is
a root of the equation $f'(z) = 0$. Such a point is called the saddle point, since the
function $\operatorname{Re} f(z)$ has a graph similar to a saddle or mountain pass. The saddle-point
method requires choosing the contour of integration such that it crosses the saddle
point $z_0$ in the direction of the steepest descent. However, finding such a contour
and applying it are complicated problems, so for the sake of simplicity one usually
does not choose the best contour, hence losing some accuracy in the remainder
term when estimating the integral.

A parametric representation of the contour transforms the contour integral to 
an integral with a real variable of integration. Therefore the following theorem 
on estimating integrals with increasing parameters, based on Laplace’s method, 
sometimes provides an answer to the initial question on estimating integrals. 


Theorem 1.1.7. If the integral

$$G(\lambda) = \int_{-\infty}^{\infty} g(t) e^{\lambda f(t)}\,dt$$

converges absolutely for some $\lambda = \lambda_0$, that is,

$$\int_{-\infty}^{\infty} |g(t)| e^{\lambda_0 f(t)}\,dt \le M;$$

if the function $f(t)$ attains its maximum at a point $t_0$ and in a neighborhood of
this point

$$f(t) = f(t_0) + a_2 (t - t_0)^2 + a_3 (t - t_0)^3 + \cdots$$

with $a_2 < 0$;
if for an arbitrarily small $\delta > 0$, there exists $h = h(\delta) > 0$ such that

$$f(t_0) - f(t) \ge h$$

for $|t - t_0| \ge \delta$;
and if, as $t \to t_0$,

$$g(t) = c (t - t_0)^{2m} (1 + O(|t - t_0|)),$$

where $c$ is a nonzero constant and $m$ is a nonnegative integer, then, as $\lambda \to \infty$,

$$G(\lambda) = e^{\lambda f(t_0)}\, c\, \Gamma(m + 1/2)\, (-a_2 \lambda)^{-m-1/2} \left(1 + O(1/\sqrt{\lambda})\right),$$

where $\Gamma(x)$ is the Euler gamma function and

$$-a_2 = -f''(t_0)/2.$$

In particular, if $m = 0$, then $c = g(t_0)$, and as $\lambda \to \infty$,

$$G(\lambda) = e^{\lambda f(t_0)} \frac{g(t_0)}{\sqrt{-f''(t_0)/2}} \sqrt{\frac{\pi}{\lambda}} \left(1 + O(1/\sqrt{\lambda})\right). \tag{1.1.5}$$


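To demonstrate that this rather complicated theorem can really be used, let us
estimate the integral

$$\Gamma(\lambda + 1) = \int_0^{\infty} x^{\lambda} e^{-x}\,dx$$

as $\lambda \to \infty$, and obtain the Stirling formula. The change of variables $x = \lambda t$ leads
to the equation

$$\Gamma(\lambda + 1) = \lambda^{\lambda+1} e^{-\lambda} \int_0^{\infty} e^{-\lambda (t - 1 - \log t)}\,dt.$$

Here $g(t) = 1$, and $f(t) = -(t - 1 - \log t)$, $f(1) = 0$, $f'(1) = 0$, $f''(1) = -1$.
The conditions of the theorem are fulfilled; therefore, by (1.1.5),

$$G(\lambda) = \int_0^{\infty} e^{-\lambda (t - 1 - \log t)}\,dt = \sqrt{2\pi/\lambda}\,\left(1 + O(1/\sqrt{\lambda})\right),$$

and for the Euler gamma function, we obtain the representation

$$\Gamma(\lambda + 1) = \lambda^{\lambda} e^{-\lambda} \sqrt{2\pi\lambda}\,\left(1 + O(1/\sqrt{\lambda})\right)$$

as $\lambda \to \infty$, coinciding with the Stirling formula, except for the remainder term,
which can be improved to $O(1/\lambda)$.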
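
The quality of this approximation is easy to examine numerically. The sketch below is an illustration, not part of the derivation; the function name is hypothetical, and logarithms are used to avoid overflow for large arguments. It computes the ratio of the gamma function to the main term of the Stirling formula.

```python
from math import exp, lgamma, log, pi


def stirling_ratio(lam):
    """Gamma(lam + 1) / (lam^lam e^(-lam) sqrt(2 pi lam)), via logarithms."""
    log_main_term = lam * log(lam) - lam + 0.5 * log(2 * pi * lam)
    return exp(lgamma(lam + 1) - log_main_term)


if __name__ == "__main__":
    for lam in (10.0, 100.0, 1000.0):
        ratio = stirling_ratio(lam)
        # lam * (ratio - 1) stays bounded, in line with a remainder O(1/lam).
        print(lam, ratio, lam * (ratio - 1))
```

The printed values of lam * (ratio - 1) stay close to 1/12, the coefficient of the first correction term in the classical Stirling series, which is consistent with the improved remainder term.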

Generating functions are only suited for nonnegative integer-valued random 
variables. A more universal method of proving theorems on the convergence of 
sequences of random variables is provided by characteristic functions. The
characteristic function of a random variable $\xi$ or the characteristic function of its
distribution is defined as

$$\varphi(t) = \varphi_\xi(t) = E e^{it\xi} = \int_{-\infty}^{\infty} e^{itx}\,dF(x), \tag{1.1.6}$$

where $-\infty < t < \infty$ and $F(x)$ is the distribution function of $\xi$.
If the $r$th moment $m_r$ exists, then the characteristic function $\varphi(t)$ is $r$ times
differentiable, and

$$\varphi^{(r)}(0) = i^r m_r.$$

Characteristic functions are convenient for investigating sums of independent
random variables, since if $S_n = \xi_1 + \cdots + \xi_n$, where $\xi_1, \dots, \xi_n$ are independent
random variables, then

$$\varphi_{S_n}(t) = \varphi_{\xi_1}(t) \cdots \varphi_{\xi_n}(t).$$

The characteristic function of the normal distribution with parameters $(m, \sigma)$ and
density

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-m)^2/(2\sigma^2)}$$

is $e^{imt - \sigma^2 t^2/2}$.

Relation (1.1.6) defines a one-to-one correspondence between characteristic 
functions and distributions. There are different inversion formulas that provide a 
formal possibility of reconstructing a distribution from its characteristic function, 
but they have limited practical applications. We state the simplest version of the 
inversion formulas. 


Theorem 1.1.8. If a characteristic function $\varphi(t)$ is absolutely integrable, then
the corresponding distribution has the bounded density

$$p(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \varphi(t)\,dt.$$


The correspondence defined by (1.1.6) is continuous in the following sense. 


Theorem 1.1.9. A sequence of distributions converges weakly to a limit
distribution if and only if the corresponding sequence of characteristic functions $\varphi_n(t)$
converges to a continuous function $\varphi(t)$ as $n \to \infty$ at every fixed $t$, $-\infty < t < \infty$.
In this case, $\varphi(t)$ is the characteristic function of the limit distribution, and the
convergence $\varphi_n(t) \to \varphi(t)$ is uniform in any finite interval.


For a sequence $\xi_n$ of characteristics of random combinatorial objects, applying
Theorem 1.1.9 gives the limit distribution function. But for integer-valued
characteristics, one would rather have an indication of the local behavior, that is, the
behavior of the probabilities of individual values. To this end the so-called local 
limit theorems of probability theory are used. 

Let $\xi$ be an integer-valued random variable and $p_n = P\{\xi = n\}$. It is clear that
$P\{\xi \in \Gamma_1\} = 1$, where $\Gamma_1$ is the lattice of all integers. If there exists a lattice $\Gamma_d$
with a span $d$ such that $P\{\xi \in \Gamma_d\} = 1$ and there is no lattice $\Gamma$ with span greater
than $d$ such that $P\{\xi \in \Gamma\} = 1$, then $d$ is called the maximal span of the distribution
of $\xi$. The characteristic function $\varphi(t)$ of the random variable $\xi$ is periodic with
period $2\pi/d$ and $|\varphi(t)| < 1$ for $0 < t < 2\pi/d$.
For integer-valued random variables, the inversion formula has the following
form:

$$p_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-itn} \varphi(t)\,dt.$$


Consider the sum $S_N = \xi_1 + \cdots + \xi_N$ of independent identically distributed
integer-valued random variables $\xi_1, \dots, \xi_N$. When the distributions of the
summands are identical and do not depend on $N$, the problem of estimating the
probabilities $P\{S_N = n\}$, as $N \to \infty$, has been completely solved. If there exist sequences
of centering and normalizing numbers $A_N$ and $B_N$ such that the distributions of the
random variables $(S_N - A_N)/B_N$ converge weakly to some distribution, then the
limit distribution has a density. Moreover, a local limit theorem holds on the lattice
with a span equal to the maximal span of the distribution of the random variable
$\xi_1$. If the maximal span of the distribution of $\xi_1$ is 1, then the local theorem holds
on the lattice of integers.


Theorem 1.1.10. Let $\xi_1, \xi_2, \dots$ be a sequence of independent identically
distributed integer-valued random variables and let there exist $A_N$ and $B_N$ such that,
as $N \to \infty$ for any fixed $x$,

$$P\{(S_N - A_N)/B_N < x\} \to \int_{-\infty}^{x} p(u)\,du.$$

Then, if the maximal span of the distribution of $\xi_1$ is 1,

$$B_N P\{S_N = n\} - p((n - A_N)/B_N) \to 0$$

uniformly in $n$.


Local limit theorems are of primary importance in what follows. Therefore, let 
us prove a local theorem on convergence to the normal distribution as a model 
for proofs of local limit theorems in more complex cases, which will be discussed 
later in the book. 


Theorem 1.1.11. Let the independent identically distributed integer-valued
random variables $\xi_1, \xi_2, \dots$ have a mathematical expectation $a$ and a positive
variance $\sigma^2$. Then, if the maximal span of the distribution of $\xi_1$ is 1,

$$\sigma\sqrt{N}\, P\{\xi_1 + \cdots + \xi_N = n\} - \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{(n - aN)^2}{2\sigma^2 N}\right\} \to 0$$

uniformly in $n$ as $N \to \infty$.


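Proof. Let

$$z = \frac{n - aN}{\sigma\sqrt{N}}$$

and $P_N(n) = P\{\xi_1 + \cdots + \xi_N = n\}$.
If $\varphi(t)$ is the characteristic function of the random variable $\xi_1$, then the
characteristic function of the sum $S_N = \xi_1 + \cdots + \xi_N$ is equal to $\varphi^N(t)$, and

$$\varphi^N(t) = \sum_{n=-\infty}^{\infty} P_N(n) e^{itn}.$$

By the inversion formula,

$$P_N(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-itn} \varphi^N(t)\,dt. \tag{1.1.7}$$

Let $\varphi^*(t)$ denote the characteristic function of the centered random variable $\xi_1 - a$,
which equals $\varphi(t) \exp\{-ita\}$. Since $n = aN + \sigma z\sqrt{N}$, it follows from (1.1.7) that

$$P_N(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-it\sigma z\sqrt{N}} (\varphi^*(t))^N\,dt.$$

After the substitution $x = t\sigma\sqrt{N}$, this equality takes the form

$$\sigma\sqrt{N}\, P_N(n) = \frac{1}{2\pi} \int_{-\pi\sigma\sqrt{N}}^{\pi\sigma\sqrt{N}} e^{-ixz} \left(\varphi^*(x/(\sigma\sqrt{N}))\right)^N dx. \tag{1.1.8}$$

By the inversion formula,

$$\frac{1}{\sqrt{2\pi}} e^{-z^2/2} = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ixz - x^2/2}\,dx. \tag{1.1.9}$$
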

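It follows from (1.1.8) and (1.1.9) that the difference

$$R_N = 2\pi \left(\sigma\sqrt{N}\, P_N(n) - \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\right) \tag{1.1.10}$$

can be written as the sum of the following four integrals:

$$I_1 = \int_{-A}^{A} e^{-ixz} \left(\left(\varphi^*(x/(\sigma\sqrt{N}))\right)^N - e^{-x^2/2}\right) dx,$$

$$I_2 = -\int_{A < |x|} e^{-ixz - x^2/2}\,dx,$$

$$I_3 = \int_{A < |x| \le \varepsilon\sigma\sqrt{N}} e^{-ixz} \left(\varphi^*(x/(\sigma\sqrt{N}))\right)^N dx,$$

$$I_4 = \int_{\varepsilon\sigma\sqrt{N} < |x| \le \pi\sigma\sqrt{N}} e^{-ixz} \left(\varphi^*(x/(\sigma\sqrt{N}))\right)^N dx,$$

where the constants $A$ and $\varepsilon$ will be chosen later.
To see that $R_N \to 0$ as $N \to \infty$, we take an arbitrary $\delta > 0$ and show that $|R_N|$
can be made less than $\delta$ for sufficiently large $N$.
For $I_2$, we have

$$|I_2| \le \int_{A \le |x|} e^{-x^2/2}\,dx,$$

and $|I_2|$ can be made arbitrarily small by the choice of sufficiently large $A$.
Since $E\xi_1 = a$ and $D\xi_1 = \sigma^2$, for the characteristic function $\varphi^*(t)$ as $t \to 0$,
we have

$$\varphi^*(t) = 1 - \frac{\sigma^2 t^2}{2} + o(t^2). \tag{1.1.11}$$

Let $\varphi_N(x)$ denote the characteristic function of $(S_N - aN)/(\sigma\sqrt{N})$, which equals
$\left(\varphi^*(x/(\sigma\sqrt{N}))\right)^N$. For any fixed $x$ and $N \to \infty$, we obtain from (1.1.11) the
relation

$$\log \varphi_N(x) = N \log \varphi^*(x/(\sigma\sqrt{N})) = N \log\left(1 - \frac{x^2}{2N} + o(1/N)\right) = -\frac{x^2}{2} + o(1),$$

implying that for any fixed $x$ as $N \to \infty$,

$$\varphi_N(x) \to e^{-x^2/2}. \tag{1.1.12}$$

Moreover, as seen from (1.1.11), there exists $\varepsilon > 0$ such that, for $|t| \le \varepsilon$,

$$|\varphi^*(t)| \le 1 - \frac{\sigma^2 t^2}{4} \le e^{-\sigma^2 t^2/4}. \tag{1.1.13}$$

Using this inequality to estimate $I_3$, we find that

$$|I_3| \le \int_{A < |x| \le \varepsilon\sigma\sqrt{N}} \left|\varphi^*(x/(\sigma\sqrt{N}))\right|^N dx \le \int_{A < |x| \le \varepsilon\sigma\sqrt{N}} e^{-x^2/4}\,dx,$$

and by the choice of sufficiently large $A$, $|I_3|$ can be made arbitrarily small.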

Let $\varepsilon$ be such that (1.1.13) is satisfied and let $A$ be large enough so that $|I_2| < \delta/4$
and $|I_3| < \delta/4$. Let us now estimate the integrals $I_1$ and $I_4$ for fixed $\varepsilon$ and $A$.
Relation (1.1.12) implies that the distribution of $(S_N - aN)/(\sigma\sqrt{N})$ converges weakly,
as $N \to \infty$, to the normal distribution with parameters (0, 1). The convergence
of the characteristic functions $\varphi_N(x)$ to the characteristic function of the normal
law is uniform in any finite interval, and the integral $I_1$ tends to zero as $N \to \infty$.

For $I_4$, we have

$$|I_4| \le \int_{\varepsilon\sigma\sqrt{N} < |x| \le \pi\sigma\sqrt{N}} \left|\varphi^*(x/(\sigma\sqrt{N}))\right|^N dx = \sigma\sqrt{N} \int_{\varepsilon < |t| \le \pi} |\varphi(t)|^N\,dt.$$

Since the maximal span of the distribution of $\xi_1$ is 1,

$$\max_{\varepsilon \le |t| \le \pi} |\varphi(t)| = q < 1.$$

Hence,

$$|I_4| \le \sigma\sqrt{N}\, 2\pi q^N,$$

and $I_4 \to 0$ as $N \to \infty$.
The estimates of $I_1$ and $I_4$ show that there exists $N_0$ such that $|I_1| < \delta/4$ and
$|I_4| < \delta/4$ for $N > N_0$.
Thus the difference $R_N$ tends to zero as $N \to \infty$ uniformly for all integers $n$.
∎
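
The statement of Theorem 1.1.11, including the role of the maximal span, can be observed numerically. The following Python sketch is an illustration with hypothetical function names and arbitrarily chosen example distributions: it computes the exact distribution of the sum by repeated convolution and measures the largest deviation from the normal density over a window of integers containing the support of the sum.

```python
from math import exp, pi, sqrt


def convolve(a, b):
    """Law of the sum of two independent variables with laws a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def sum_distribution(probs, n_terms):
    """Law of S_N for N i.i.d. summands with P{xi = k} = probs[k]."""
    dist = [1.0]
    for _ in range(n_terms):
        dist = convolve(dist, probs)
    return dist


def max_local_error(probs, n_terms):
    """Largest |sigma sqrt(N) P{S_N = n} - exp(-z^2/2)/sqrt(2 pi)| found."""
    a = sum(k * p for k, p in enumerate(probs))
    var = sum((k - a) ** 2 * p for k, p in enumerate(probs))
    scale = sqrt(var * n_terms)               # sigma * sqrt(N)
    dist = sum_distribution(probs, n_terms)   # P{S_N = n}, n = 0, ..., len - 1
    worst = 0.0
    # Scan a window of integers that contains the support of S_N.
    for n in range(-len(dist), 2 * len(dist)):
        p_n = dist[n] if 0 <= n < len(dist) else 0.0
        z = (n - a * n_terms) / scale
        worst = max(worst, abs(scale * p_n - exp(-z * z / 2) / sqrt(2 * pi)))
    return worst


if __name__ == "__main__":
    span_one = [0.5, 0.3, 0.2]   # values 0, 1, 2: maximal span 1
    span_two = [0.5, 0.0, 0.5]   # values 0 and 2: maximal span 2
    for n_terms in (25, 100, 400):
        print(n_terms, max_local_error(span_one, n_terms),
              max_local_error(span_two, n_terms))
```

For the first distribution the deviation decreases as the number of summands grows. For the second, the sum takes only even values, so at odd integers near the mean the probability is zero while the normal term is close to its maximum, and the deviation does not tend to zero.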


In most applications of local theorems in this text, the distribution of the
summands of the sum $S_N = \xi_1 + \cdots + \xi_N$ depends on the number of summands $N$.
In such cases, there is no complete answer to the question of when the local
theorem holds for $S_N$. Even in the case of convergence to the normal law, the known
sufficient conditions for the validity of a local theorem cannot be deemed fully 
satisfactory. Hence, for each specific distribution whose parameters depend on the 
number of summands in the sum, it is necessary to invoke the classical scheme 
given above as a model. In the hope of finding simple sufficient conditions for the 
validity of local theorems for integer-valued identically distributed summands, as 
in Theorems 1.1.10 and 1.1.11, we will often omit the particularly cumbersome 
calculations arising in estimating characteristic functions. 

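If $\xi_1, \dots, \xi_N$ are independent identically distributed random variables such that

$$P\{\xi_1 = 1\} = p \quad \text{and} \quad P\{\xi_1 = 0\} = q = 1 - p \quad \text{for } 0 < p < 1,$$

then $S_N = \xi_1 + \cdots + \xi_N$ has the binomial distribution with parameters $(N, p)$,
that is, for any $k = 0, 1, \dots, N$,

$$P\{S_N = k\} = \binom{N}{k} p^k q^{N-k}.$$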


If $Npq \to \infty$, then the binomial distribution is approximated by the normal law.
The following theorem, known as the local de Moivre-Laplace theorem, can be
obtained by a direct analysis of the explicit formula.


Theorem 1.1.12. If $N \to \infty$ and $(1 + u^6)/(Npq) \to 0$, where

$$u = \frac{k - Np}{\sqrt{Npq}},$$

then

$$\binom{N}{k} p^k q^{N-k} = \frac{1}{\sqrt{2\pi Npq}} e^{-u^2/2} \left(1 + \frac{q - p}{6\sqrt{Npq}} (u^3 - 3u) + O\left(\frac{1 + u^6}{Npq}\right)\right).$$

Theorem 1.1.12 implies the well-known integral de Moivre-Laplace theorem.

Theorem 1.1.13. If $N \to \infty$ and $(1 + u^6)/(Npq) \to 0$, where

$$u = \frac{k - Np}{\sqrt{Npq}},$$

then

$$P\{S_N < k\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} e^{-x^2/2}\,dx\,(1 + o(1)).$$

If $p \to 0$, then the binomial distribution is approximated by the Poisson law.
It is well known that if $N \to \infty$ and $Np \to \lambda$, $0 < \lambda < \infty$, then

$$\binom{N}{k} p^k (1 - p)^{N-k} \to \frac{\lambda^k e^{-\lambda}}{k!}$$

for any fixed $k = 0, 1, \ldots$. The Poisson approximation is also valid if $Np$ tends
to infinity not too quickly.


Theorem 1.1.14. If $N \to \infty$, $Np \to \infty$, $(1 + u^2)p \to 0$, where

$$u = \frac{k - Np}{\sqrt{Np}},$$

then

$$\binom{N}{k} p^k (1 - p)^{N-k} = \frac{(Np)^k}{k!}\, e^{-Np} (1 + o(1)).$$


The Poisson distribution converges to the normal law as its parameter tends to 
infinity. 


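Theorem 1.1.15. If $(1 + u^6)/\lambda \to 0$, where $u = (k - \lambda)/\sqrt{\lambda}$, then

$$\frac{\lambda^k e^{-\lambda}}{k!} = \frac{1}{\sqrt{2\pi\lambda}}\, e^{-u^2/2}
\left(1 + \frac{u^3 - 3u}{6\sqrt{\lambda}} + O\!\left(\frac{1 + u^6}{\lambda}\right)\right).$$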


Sometimes it is necessary to estimate the tails of the binomial distribution in 
the form of an inequality with an explicit constant. 


Theorem 1.1.16. For any $x > 0$,

$$P\{S_N - ES_N > Nx\} \le e^{-2Nx^2}.$$


1.2. The generalized scheme of allocation 


In the past three decades, the so-called generalized scheme of allocation of particles 
has been applied to many probabilistic problems of combinatorics, and many of 
the results in this text were obtained by reducing combinatorial problems to such 
a generalized scheme. 
Consider $n$ independent trials, each having $N$ equiprobable outcomes, $1, 2, \ldots, N$.
Let $\eta_i$ denote the number of occurrences of the $i$th outcome in this sequence
of trials, $i = 1, 2, \ldots, N$. The random variables $\eta_1, \ldots, \eta_N$ have the multinomial
distribution: If the nonnegative integers $k_1, \ldots, k_N$ are such that $k_1 + \cdots + k_N = n$,
then

$$P\{\eta_1 = k_1, \ldots, \eta_N = k_N\} = \frac{n!}{k_1! \cdots k_N!\, N^n}. \tag{1.2.1}$$

The situation in which the multinomial distribution arises can be described in 
terms of an equiprobable scheme of allocating particles to cells. If n particles are 
independently distributed with equal probabilities into N cells labeled 1,2, ..., N, 
then the contents of cells $\eta_1, \ldots, \eta_N$ have the multinomial distribution (1.2.1).

In the scheme of allocating particles to cells yielding the multinomial distri- 
bution, the contents of cells can be obtained by independent sequential allocation 
of particles. If one does not require that the contents of cells can be obtained by 
some sequential allocation of particles, with a simple probability law governing 
the sequential trials, then any set of integer-valued nonnegative random variables 
$\eta_1, \ldots, \eta_N$ such that $\eta_1 + \cdots + \eta_N = n$ can be viewed as a scheme of allocating
$n$ particles to $N$ cells, and one can interpret $\eta_i$ as the number of particles in the
cell with index $i$, $i = 1, 2, \ldots, N$.

Some probabilistic problems of combinatorics can be treated by using generalized
schemes of allocation in which the joint distribution of the contents of cells
$\eta_1, \ldots, \eta_N$ can be represented in the form

$$P\{\eta_1 = k_1, \ldots, \eta_N = k_N\} = P\{\xi_1 = k_1, \ldots, \xi_N = k_N \mid \xi_1 + \cdots + \xi_N = n\}, \tag{1.2.2}$$


where $\xi_1, \ldots, \xi_N$ are independent identically distributed integer-valued random
variables.

The generalized scheme of allocating particles to cells is given by the parameters
$n$ and $N$ and the distribution of the random variables $\xi_1, \ldots, \xi_N$, which by relation
(1.2.2) determines the joint distribution of the contents of the cells $\eta_1, \ldots, \eta_N$. Set

$$p_k = P\{\xi_1 = k\}, \quad k = 0, 1, \ldots. \tag{1.2.3}$$


For the random variables $\eta_1, \ldots, \eta_N$ with the multinomial distribution (1.2.1),
relation (1.2.2) is satisfied if $\xi_1$ has the Poisson distribution with arbitrary
parameter $\lambda$:

$$p_k = P\{\xi_1 = k\} = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, \ldots. \tag{1.2.4}$$

Therefore the distribution of $\eta_1, \ldots, \eta_N$ satisfying relation (1.2.2) for some
distribution (1.2.3) can be viewed as a generalization of the multinomial distribution.
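
The link between the multinomial distribution and conditioned Poisson variables is easy to verify for small parameters. The following sketch (function names and test values are illustrative assumptions) compares the multinomial probabilities with the conditional probabilities of independent Poisson variables given their sum. It uses the standard fact that a sum of $N$ independent Poisson variables with parameter $\lambda$ is Poisson with parameter $N\lambda$.

```python
from itertools import product
from math import exp, factorial


def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)


def multinomial_pmf(ks):
    """n! / (k_1! ... k_N! N^n) for cell contents ks with n = sum(ks)."""
    n, N = sum(ks), len(ks)
    coef = factorial(n)
    for k in ks:
        coef //= factorial(k)  # stays an integer at every step
    return coef / N**n


def conditioned_poisson_pmf(ks, lam):
    """P{xi_1 = k_1, ..., xi_N = k_N | xi_1 + ... + xi_N = n} for Poisson(lam) xi_i."""
    n, N = sum(ks), len(ks)
    joint = 1.0
    for k in ks:
        joint *= poisson_pmf(k, lam)
    # the sum of N independent Poisson(lam) variables is Poisson(N * lam)
    return joint / poisson_pmf(n, N * lam)


def max_gap(n, N, lam):
    """Largest discrepancy between the two distributions over all allocations."""
    return max(abs(multinomial_pmf(ks) - conditioned_poisson_pmf(ks, lam))
               for ks in product(range(n + 1), repeat=N) if sum(ks) == n)


if __name__ == "__main__":
    for lam in (0.3, 1.0, 4.0):
        print(lam, max_gap(5, 3, lam))
```

The discrepancy is at the level of rounding error for every tested $\lambda$, which reflects the fact that the parameter may be chosen arbitrarily.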
The term “classical scheme of allocation” has become common for the equiprobable
scheme of allocating particles to cells leading to the multinomial distribution
(1.2.1). The terminology of the classical scheme of allocating particles to
cells proved to be convenient for describing a number of combinatorial problems 
where the multinomial distribution appears. Many results pertaining to the classi- 
cal scheme of allocation can be obtained by applying relation (1.2.2) between the 
multinomial distribution and the Poisson distribution (1.2.4). Introducing gener- 
alized schemes of allocating particles not only broadens the scope of convenient 
language for describing combinatorial objects, but also offers the possibility of 
applying methods based on relation (1.2.2) that have been developed to analyze 
the classical scheme. 

Let $\mu_r(n, N)$ denote the number of cells containing exactly $r$ particles in the
generalized scheme of allocation with distributions (1.2.2) and (1.2.3). We show
that the representation (1.2.2) can be used to study this random variable.

Let $\xi_1^{(r)}, \ldots, \xi_N^{(r)}$ be independent identically distributed random variables
whose distribution is linked with the distribution of $\xi_1, \ldots, \xi_N$ as follows:

$$P\{\xi_1^{(r)} = k\} = P\{\xi_1 = k \mid \xi_1 \ne r\}, \quad k = 0, 1, \ldots.$$

Also let

$$S_N = \xi_1 + \cdots + \xi_N, \qquad S_N^{(r)} = \xi_1^{(r)} + \cdots + \xi_N^{(r)}.$$


The following lemma expresses the distribution of $\mu_r(n, N)$ in terms of the
probabilities of sums of independent identically distributed random variables.


Lemma 1.2.1.

$$P\{\mu_r(n, N) = k\} = \binom{N}{k} p_r^k (1 - p_r)^{N-k}\,
\frac{P\{S_{N-k}^{(r)} = n - kr\}}{P\{S_N = n\}}. \tag{1.2.5}$$

Proof. Let $A_k^{(r)}$ be the event that exactly $k$ of the random variables $\xi_1, \ldots, \xi_N$
take the value $r$. By equality (1.2.2),

$$P\{\mu_r(n, N) = k\} = P\{A_k^{(r)} \mid S_N = n\} = \frac{P\{A_k^{(r)}, S_N = n\}}{P\{S_N = n\}}.$$

The lemma is derived by obvious manipulations of the numerator: The event $A_k^{(r)}$
can occur for $\binom{N}{k}$ distinct choices of random variables taking the value $r$; therefore

$$\begin{aligned}
P\{A_k^{(r)}, S_N = n\}
&= \binom{N}{k} p_r^k (1 - p_r)^{N-k}\,
P\{S_N = n \mid \xi_1 \ne r, \ldots, \xi_{N-k} \ne r,\ \xi_{N-k+1} = \cdots = \xi_N = r\} \\
&= \binom{N}{k} p_r^k (1 - p_r)^{N-k}\, P\{S_{N-k}^{(r)} = n - kr\}.
\end{aligned}$$ □
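
Lemma 1.2.1 can be checked by brute force for small $n$ and $N$. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the distribution of $\xi_1$ is passed as a list `pmf` with entries for the values $0, \ldots, n$ (larger values cannot occur once the sum is fixed at $n$), and $0 \le r \le n$. One function evaluates the right-hand side of (1.2.5) by convolution, the other enumerates all cell contents according to (1.2.2).

```python
from itertools import product
from math import comb, exp, factorial


def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)


def sum_pmf(pmf, count, total):
    """P{X_1 + ... + X_count = total} for i.i.d. X_i with P{X_i = j} = pmf[j]."""
    dist = [1.0] + [0.0] * total  # distribution of the empty sum
    for _ in range(count):
        dist = [sum(dist[j] * pmf[t - j] for j in range(t + 1))
                for t in range(total + 1)]
    return dist[total]


def mu_r_by_lemma(n, N, r, k, pmf):
    """Right-hand side of (1.2.5)."""
    if n - k * r < 0:
        return 0.0
    p_r = pmf[r]
    # distribution of xi^(r): xi_1 conditioned on xi_1 != r
    pmf_r = [0.0 if j == r else pmf[j] / (1 - p_r) for j in range(n + 1)]
    numerator = sum_pmf(pmf_r, N - k, n - k * r)
    return comb(N, k) * p_r**k * (1 - p_r)**(N - k) * numerator / sum_pmf(pmf, N, n)


def mu_r_by_enumeration(n, N, r, k, pmf):
    """P{mu_r(n, N) = k} obtained directly from (1.2.2)."""
    hit = total = 0.0
    for ks in product(range(n + 1), repeat=N):
        if sum(ks) != n:
            continue
        weight = 1.0
        for c in ks:
            weight *= pmf[c]
        total += weight
        if ks.count(r) == k:
            hit += weight
    return hit / total


if __name__ == "__main__":
    n, N = 6, 4
    pmf = [poisson_pmf(j, 1.3) for j in range(n + 1)]
    for k in range(N + 1):
        print(k, mu_r_by_lemma(n, N, 1, k, pmf), mu_r_by_enumeration(n, N, 1, k, pmf))
```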
In the generalized scheme of allocating particles, there is a rather simple
approach to study the order statistics $\eta_{(1)} \le \eta_{(2)} \le \cdots \le \eta_{(N)}$ constructed for the
random variables $\eta_1, \ldots, \eta_N$ arranged in nondecreasing order.

Let $\xi_1^{(A)}, \ldots, \xi_N^{(A)}$ be independent identically distributed random variables such
that

$$P\{\xi_1^{(A)} = k\} = P\{\xi_1 = k \mid \xi_1 \notin A\}, \quad k = 0, 1, \ldots,$$

where $A$ is a subset of the set of natural numbers with $P\{\xi_1 \notin A\} > 0$. In particular,
if $A$ consists of one value $r$, then $\xi_1^{(A)} = \xi_1^{(r)}$, where $\xi_1^{(r)}$ is the random variable
defined preceding Lemma 1.2.1. Set

$$S_N^{(A)} = \xi_1^{(A)} + \cdots + \xi_N^{(A)}.$$


The following lemma reduces the study of distributions of order statistics to 
that of probabilities related to sums of independent random variables. 


Lemma 1.2.2. For any positive integer $m$,

$$P\{\eta_{(m)} \le r\} = 1 - \sum_{l=0}^{m-1} \binom{N}{l} (1 - P_r)^l P_r^{N-l}\,
\frac{P\{S_l^{(\bar A_r)} + S_{N-l}^{(A_r)} = n\}}{P\{S_N = n\}}, \tag{1.2.6}$$

$$P\{\eta_{(N-m+1)} \le r\} = \sum_{l=0}^{m-1} \binom{N}{l} P_r^l (1 - P_r)^{N-l}\,
\frac{P\{S_{N-l}^{(\bar A_r)} + S_l^{(A_r)} = n\}}{P\{S_N = n\}}, \tag{1.2.7}$$

where $A_r$ is the set of all nonnegative integers not exceeding $r$, $\bar A_r$ is its complement
in the set of all nonnegative integers, and $P_r = P\{\xi_1 > r\}$.


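Proof. Let us prove (1.2.7) for $m = 1$. For the maximal order statistic
$\eta_{(N)} = \max(\eta_1, \ldots, \eta_N)$, by (1.2.2) and the independence of $\xi_1, \ldots, \xi_N$, we have

$$\begin{aligned}
P\{\eta_{(N)} \le r\} &= P\{\eta_1 \le r, \ldots, \eta_N \le r\} \\
&= P\{\xi_1 \le r, \ldots, \xi_N \le r \mid S_N = n\} \\
&= \frac{(P\{\xi_1 \le r\})^N\, P\{S_N = n \mid \xi_1 \le r, \ldots, \xi_N \le r\}}{P\{S_N = n\}}.
\end{aligned}$$

By using the random variables $\xi_1^{(\bar A_r)}, \ldots, \xi_N^{(\bar A_r)}$, we finally obtain

$$P\{\eta_{(N)} \le r\} = \frac{(1 - P_r)^N\, P\{S_N^{(\bar A_r)} = n\}}{P\{S_N = n\}}. \tag{1.2.8}$$

Relations (1.2.6) and (1.2.7) for other values of $m$ are similarly proved. □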
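
As an illustration of (1.2.8), the distribution of the maximal cell content in the classical scheme can be computed from sums of truncated Poisson variables and compared with a direct enumeration of all $N^n$ allocations. This is a minimal sketch: the function names are hypothetical, and `pmf` is assumed to list $P\{\xi_1 = j\}$ for $j = 0, \ldots, n$.

```python
from itertools import product
from math import exp, factorial


def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)


def sum_pmf(pmf, count, total):
    """P{X_1 + ... + X_count = total} for i.i.d. X_i with P{X_i = j} = pmf[j]."""
    dist = [1.0] + [0.0] * total
    for _ in range(count):
        dist = [sum(dist[j] * pmf[t - j] for j in range(t + 1))
                for t in range(total + 1)]
    return dist[total]


def max_cdf_by_formula(n, N, r, pmf):
    """P{eta_(N) <= r} from (1.2.8)."""
    below = sum(pmf[: r + 1])  # 1 - P_r = P{xi_1 <= r}
    # xi_1 conditioned on xi_1 <= r
    truncated = [pmf[j] / below if j <= r else 0.0 for j in range(n + 1)]
    return below**N * sum_pmf(truncated, N, n) / sum_pmf(pmf, N, n)


def max_cdf_classical(n, N, r):
    """Same probability by enumerating all allocations of n particles to N cells."""
    good = 0
    for cells in product(range(N), repeat=n):
        if max(cells.count(c) for c in range(N)) <= r:
            good += 1
    return good / N**n


if __name__ == "__main__":
    n, N = 5, 3
    pmf = [poisson_pmf(j, 0.9) for j in range(n + 1)]
    for r in range(n + 1):
        print(r, max_cdf_by_formula(n, N, r, pmf), max_cdf_classical(n, N, r))
```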
For the joint distribution of the random variables $\mu_{r_1}(n, N), \ldots, \mu_{r_s}(n, N)$, we
can prove the following lemma as we did in Lemma 1.2.1.
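Lemma 1.2.3.

$$P\{\mu_{r_1}(n, N) = k_1, \ldots, \mu_{r_s}(n, N) = k_s\}
= \frac{N!\, p_{r_1}^{k_1} \cdots p_{r_s}^{k_s} (1 - p_{r_1} - \cdots - p_{r_s})^{N - k_1 - \cdots - k_s}}
{k_1! \cdots k_s!\, (N - k_1 - \cdots - k_s)!}
\cdot \frac{P\{S^{(A)}_{N - k_1 - \cdots - k_s} = n - k_1 r_1 - \cdots - k_s r_s\}}{P\{S_N = n\}},$$

where $A = \{r_1, \ldots, r_s\}$, $s \ge 1$, $k_1, \ldots, k_s, r_1, \ldots, r_s$ are nonnegative integers, and
$r_1, \ldots, r_s$ are distinct.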


Lemmas 1.2.1, 1.2.2, and 1.2.3 express the distributions of the random variables
$\mu_r(n, N)$ and the order statistics $\eta_{(1)}, \eta_{(2)}, \ldots, \eta_{(N)}$ in the generalized scheme
of allocating particles in terms of probabilities related to sums of independent
random variables. Obtaining limit distributions for the random variables $\mu_r(n, N)$
and $\eta_{(1)}, \eta_{(2)}, \ldots, \eta_{(N)}$ is reduced to applying local limit theorems for sums of
independent identically distributed integer-valued random variables.

We now give some examples of how combinatorial problems can be reduced to 
the generalized scheme of allocating particles to cells. 


Example 1.2.1. Consider single-valued mappings of the set $X_n = \{1, 2, \ldots, n\}$
into itself. A single-valued mapping $s$ of the set $X_n$ into itself can be represented
as

$$s = \begin{pmatrix} 1 & 2 & \cdots & n \\ s_1 & s_2 & \cdots & s_n \end{pmatrix},$$

where $s_k$ denotes the image of $k$, $k = 1, 2, \ldots, n$, under the mapping $s$. The
mapping $s$ may be thought of as an oriented graph $\Gamma_n = \Gamma(X_n, W_n)$ with vertex
set $X_n$ and arcs $W_n = \{(k, s_k),\ k = 1, 2, \ldots, n\}$, where the arc $(k, s_k)$ is directed
from $k$ to $s_k$, $k = 1, 2, \ldots, n$. The number of arcs entering the vertex $k$ in the graph
$\Gamma_n$, which is the number of pre-images of the element $k$ under the mapping $s$, is
called the multiplicity of the vertex $k$.

Let $\Sigma_n$ denote the set of all single-valued mappings of $X_n$ into itself, and
$\Gamma_n$ the set of all graphs of these mappings. The number of elements of $\Sigma_n$ is
obviously equal to $n^n$. If the uniform distribution is defined on the set $\Sigma_n$, then we
obtain a probability space whose set of elementary events $\Omega$ is the set $\Sigma_n$, and the
probability for any subset of $\Sigma_n$ is the number of elements in the subset divided
by $n^n$. The random mapping $\sigma$ is any of the $n^n$ possible mappings with probability
$P\{\sigma = s\} = n^{-n}$, $s \in \Sigma_n$. If

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma_1 & \sigma_2 & \cdots & \sigma_n \end{pmatrix},$$

where the random variable $\sigma_i$ is the random image of the element $i$, $i = 1, 2, \ldots, n$,
then, for any $s$,

$$P\{\sigma = s\} = P\{\sigma_1 = s_1, \ldots, \sigma_n = s_n\} = n^{-n}.$$

Thus the random variables $\sigma_1, \ldots, \sigma_n$ are independent and take the values
$1, 2, \ldots, n$ with equal probabilities.

Let $\eta_r$ denote the multiplicity of the vertex $r$ in the random mapping $\sigma$,
$r = 1, 2, \ldots, n$. The quantity $\eta_r$ is equal to the number of random variables
$\sigma_1, \ldots, \sigma_n$ taking the value $r$; thus, for nonnegative integers $k_1, \ldots, k_n$ with
$k_1 + \cdots + k_n = n$, the probability $P\{\eta_1 = k_1, \ldots, \eta_n = k_n\}$ is equal to the sum of
probabilities $P\{\sigma_1 = s_1, \ldots, \sigma_n = s_n\} = n^{-n}$, where among $s_1, \ldots, s_n$ there are
exactly $k_r$ values equal to $r$, $r = 1, 2, \ldots, n$. The number of summands in this sum is
obviously $n!/(k_1! \cdots k_n!)$; therefore

$$P\{\eta_1 = k_1, \ldots, \eta_n = k_n\} = \frac{n!}{k_1! \cdots k_n!\, n^n}.$$


Thus the joint distribution of the multiplicities of the vertices $\eta_1, \ldots, \eta_n$ of a
random mapping is the multinomial distribution. Taking the vertices as cells and
the arcs going into these vertices as particles, we obtain the classical scheme of
allocating particles to $n$ cells with multinomial distribution of the contents of the
cells $\eta_1, \ldots, \eta_n$. For the random variables $\eta_1, \ldots, \eta_n$, relation (1.2.2) holds:

$$P\{\eta_1 = k_1, \ldots, \eta_n = k_n\} = P\{\xi_1 = k_1, \ldots, \xi_n = k_n \mid \xi_1 + \cdots + \xi_n = n\},$$

in which $\xi_1, \ldots, \xi_n$ are independent and identically Poisson-distributed.

The number of vertices $\mu_r(n)$ in a random mapping with multiplicity $r$ corresponds
to the number of cells containing exactly $r$ particles in the classical scheme
of allocating $n$ particles to $n$ cells; to study these variables, as well as the order
statistics made up of the multiplicities of the vertices, one can invoke Lemmas 1.2.1,
1.2.2, and 1.2.3.


Example 1.2.2. Consider all distinct partitions of $n$ into $N$ summands not less
than $r > 0$. The number of such partitions is $\binom{n - (r-1)N - 1}{N - 1}$. Let us define the
uniform distribution on the set of these partitions by assigning the probability
$\binom{n - (r-1)N - 1}{N - 1}^{-1}$ to each partition $n = n_1 + \cdots + n_N$, $n_1, \ldots, n_N \ge r$. Then $n$ can
be written in the form

$$n = \eta_1 + \cdots + \eta_N,$$

where the summands $\eta_1, \ldots, \eta_N$ are random variables. If $n_1, \ldots, n_N \ge r$ and
$n = n_1 + \cdots + n_N$, then

$$P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = \binom{n - (r-1)N - 1}{N - 1}^{-1}.$$
The general scheme of allocation corresponding to this combinatorial problem is
obtained if we use the geometric distribution for the distribution of the random
variables $\xi_1, \ldots, \xi_N$:

$$P\{\xi_1 = k\} = p^{k-r}(1 - p), \quad k = r, r + 1, \ldots, \quad 0 < p < 1.$$

Indeed, as is easily verified,

$$P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\} = \binom{n - (r-1)N - 1}{N - 1}^{-1},$$

since, for geometrically distributed summands,

$$P\{\xi_1 + \cdots + \xi_N = n\} = \binom{n - (r-1)N - 1}{N - 1} p^{n - rN} (1 - p)^N.$$
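
The counting formula and the uniformity of the conditioned geometric variables in Example 1.2.2 can be confirmed by enumeration. The sketch below is illustrative only (function names and test values are assumptions): it lists all ordered $N$-tuples of integers not less than $r$ with sum $n$ and computes their conditional probabilities under the shifted geometric distribution.

```python
from itertools import product
from math import comb


def compositions(n, N, r):
    """All ordered N-tuples of integers >= r that sum to n."""
    return [ks for ks in product(range(r, n + 1), repeat=N) if sum(ks) == n]


def geometric_pmf(k, r, p):
    """P{xi_1 = k} = p^(k - r) (1 - p) for k = r, r + 1, ..."""
    return p**(k - r) * (1 - p) if k >= r else 0.0


def conditional_probs(n, N, r, p):
    """Conditional probabilities given the sum n, and P{xi_1 + ... + xi_N = n}."""
    weights = []
    for ks in compositions(n, N, r):
        w = 1.0
        for k in ks:
            w *= geometric_pmf(k, r, p)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights], total


if __name__ == "__main__":
    n, N, r, p = 9, 3, 2, 0.3
    probs, total = conditional_probs(n, N, r, p)
    print(len(probs), comb(n - (r - 1) * N - 1, N - 1))
    print(set(round(q, 12) for q in probs), total)
```

Every admissible tuple receives the same conditional probability whatever the value of $p$, so the parameter can again be chosen freely.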


Example 1.2.3. Note that it is not necessary for the random variables $\xi_1, \ldots, \xi_N$ in
a generalized scheme to be identically distributed. Consider the following example.
Draw $n$ balls at random without replacement from an urn containing $m_i$ balls of
the $i$th color, $i = 1, \ldots, N$. Let $\eta_i$ denote the number of balls drawn of the $i$th
color, $i = 1, \ldots, N$. It is easily seen that for nonnegative integers $n_1, \ldots, n_N$ such
that $n_1 + \cdots + n_N = n$,

$$P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = \frac{\binom{m_1}{n_1} \cdots \binom{m_N}{n_N}}{\binom{m}{n}},$$

where $m = m_1 + \cdots + m_N$.

If in the generalized scheme of allocation the random variables $\xi_1, \ldots, \xi_N$ have
the binomial distributions

$$P\{\xi_i = k\} = \binom{m_i}{k} p^k (1 - p)^{m_i - k},$$

where $0 < p < 1$, $k = 0, 1, \ldots, m_i$, $i = 1, \ldots, N$, then

$$P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\} = \frac{\binom{m_1}{n_1} \cdots \binom{m_N}{n_N}}{\binom{m}{n}},$$

and the distribution of the random variables $\eta_1, \ldots, \eta_N$ coincides with the
conditional distribution of the independent random variables $\xi_1, \ldots, \xi_N$ under the
condition $\xi_1 + \cdots + \xi_N = n$. Thus $\eta_1, \ldots, \eta_N$ may be viewed as contents of cells
in the generalized scheme of allocation, in which the random variables $\xi_1, \ldots, \xi_N$
have different binomial distributions.
Example 1.2.4. In a sense, the graph $\Gamma_n$ of a random mapping consists of trees.
Indeed, the graph can be naturally decomposed into connected components. Clearly,
each connected component of the graph $\Gamma_n$ contains exactly one cycle. Vertices in
the cycle are called cyclic. If we remove the arcs joining the cyclic vertices, then
the graph turns into a forest, that is, a graph consisting of rooted trees.

Recall that a rooted tree with n + 1 vertices is a connected undirected graph 
without cycles, with one special vertex called the root, and with n nonroot labeled 
vertices. A rooted tree with n + 1 vertices has n edges. In what follows, we view 
all edges of trees as directed away from the root, and the multiplicity of a vertex 
of a tree is defined as the number of edges emanating from it. 

Let $T_n$ denote the set of all rooted trees with $n + 1$ vertices whose roots are
labeled zero, and the $n$ nonroot vertices are labeled $1, 2, \ldots, n$. The number of
elements of the set $T_n$ is equal to $(n + 1)^{n-1}$.

A forest with $N$ roots and $n$ nonroot vertices is a graph, all of whose components
are trees. The roots of these trees are labeled with $1, \ldots, N$ and the nonroot vertices
with $1, \ldots, n$. We denote the set of all such forests by $T_{n,N}$. The number of elements
in the set $T_{n,N}$ is $N(n + N)^{n-1}$. The number of forests in which the $k$th tree contains
$n_k$ nonroot vertices, $k = 1, 2, \ldots, N$, is

$$\frac{n!}{n_1! \cdots n_N!}\, (n_1 + 1)^{n_1 - 1} \cdots (n_N + 1)^{n_N - 1},$$

where the factor $n!/(n_1! \cdots n_N!)$ is the number of partitions of $n$ vertices into $N$
ordered groups, and $(n_k + 1)^{n_k - 1}$ is the number of trees that can be constructed
from the $k$th group of vertices of each partition. Then

$$\sum_{n_1 + \cdots + n_N = n} \frac{n!\,(n_1 + 1)^{n_1 - 1} \cdots (n_N + 1)^{n_N - 1}}{n_1! \cdots n_N!} = N(n + N)^{n-1}, \tag{1.2.9}$$

where the summation is taken over nonnegative integers $n_1, \ldots, n_N$ such that
$n_1 + \cdots + n_N = n$.
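
Identity (1.2.9) is easy to test with exact integer arithmetic for small $n$ and $N$. In the sketch below the function names are hypothetical; the factor $(n_k + 1)^{n_k - 1}$ is taken to be 1 when $n_k = 0$, which corresponds to a tree consisting of the root alone.

```python
from itertools import product
from math import factorial


def rooted_trees(k):
    """(k + 1)^(k - 1): trees with a fixed root and k labeled nonroot vertices."""
    return 1 if k == 0 else (k + 1) ** (k - 1)


def forests_by_summation(n, N):
    """Left-hand side of (1.2.9), summed over all nonnegative n_1 + ... + n_N = n."""
    total = 0
    for ns in product(range(n + 1), repeat=N):
        if sum(ns) != n:
            continue
        term = factorial(n)
        for k in ns:
            term //= factorial(k)  # multinomial coefficient, exact at each step
        for k in ns:
            term *= rooted_trees(k)
        total += term
    return total


def forests_closed_form(n, N):
    """Right-hand side of (1.2.9), N (n + N)^(n - 1), for n >= 1."""
    return N * (n + N) ** (n - 1)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, [(forests_by_summation(n, N), forests_closed_form(n, N))
                  for N in range(1, 4)])
```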

Next, we define the uniform distribution on $T_{n,N}$. Let $\eta_k$ denote the number of
nonroot vertices in the $k$th tree of a random forest in $T_{n,N}$, $k = 1, \ldots, N$. For the
random variables $\eta_1, \ldots, \eta_N$, we have

$$P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = \frac{n!\,(n_1 + 1)^{n_1} \cdots (n_N + 1)^{n_N}}{N(n + N)^{n-1} (n_1 + 1)! \cdots (n_N + 1)!}, \tag{1.2.10}$$

where $n_1, \ldots, n_N$ are nonnegative integers and $n_1 + \cdots + n_N = n$.

Let us consider independent identically distributed random variables $\xi_1, \ldots, \xi_N$
for which

$$p_k = P\{\xi_1 = k\} = \frac{(k + 1)^k x^{k+1}}{(k + 1)!\,\theta(x)}, \quad k = 0, 1, \ldots, \tag{1.2.11}$$

where the parameter $x$ lies in the interval $0 < x < e^{-1}$ and the function $\theta(x)$ is
defined as

$$\theta(x) = \sum_{k=1}^{\infty} \frac{k^{k-1} x^k}{k!}.$$

By using (1.2.9), we easily obtain

$$P\{\xi_1 + \cdots + \xi_N = n\} = x^{n+N} \theta^{-N}(x) \sum_{n_1 + \cdots + n_N = n}
\frac{(n_1 + 1)^{n_1} \cdots (n_N + 1)^{n_N}}{(n_1 + 1)! \cdots (n_N + 1)!}
= \frac{N(n + N)^{n-1}}{n!}\, x^{n+N} \theta^{-N}(x);$$

hence, for any $x$, $0 < x < e^{-1}$, and for nonnegative integers $n_1, \ldots, n_N$ such that
$n_1 + \cdots + n_N = n$,

$$P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\}
= \frac{n!\,(n_1 + 1)^{n_1} \cdots (n_N + 1)^{n_N}}{N(n + N)^{n-1} (n_1 + 1)! \cdots (n_N + 1)!}. \tag{1.2.12}$$

The right-hand sides of (1.2.10) and (1.2.12) are identical, and the joint distribution
of $\eta_1, \ldots, \eta_N$ coincides with the distribution of $\xi_1, \ldots, \xi_N$ under the condition
that $\xi_1 + \cdots + \xi_N = n$. Thus, for the random variables $\eta_1, \ldots, \eta_N$ and $\xi_1, \ldots, \xi_N$,
relation (1.2.2) holds, enabling us to study tree sizes in a random forest by using
the generalized scheme of allocating particles into cells, with the random variables
$\xi_1, \ldots, \xi_N$ having the distribution given by (1.2.11).

1.3. Connectivity of graphs and the generalized scheme 


Not pretending to give an exhaustive solution, let us describe a rather general 
model of a random graph by using the generalized scheme of allocation. Consider 
the set of all graphs $\Gamma_n(R)$ with $n$ labeled vertices possessing a property $R$. We
assume that connectivity is defined for the graphs from this set and that each graph 
is represented as a union of its connected components. In the formal treatment that 
follows, it may be helpful to keep in mind the graphs of random mappings or of 
random permutations. The former graphs consist of components that are connected 
directed graphs with exactly one cycle, whereas the latter graphs consist only of 
cycles. 

Let $a_n$ denote the number of graphs in the set $\Gamma_n(R)$ and let $b_n$ be the number
of connected graphs in $\Gamma_n(R)$. We denote by $\Gamma_{n,N}(R)$ the subset of graphs
in $\Gamma_n(R)$ with exactly $N$ connected components. Note that the components of a
graph in $\Gamma_{n,N}(R)$ are unordered, and hence we can consider only the symmetric
characteristics that do not depend on the order of the components. To avoid this
restriction, we, instead, consider the set $\tilde\Gamma_{n,N}(R)$ of combinatorial objects
constructed by means of all possible orderings of the components of each graph from
$\Gamma_{n,N}(R)$. The elements of this set are ordered collections of $N$ components, each
of which is a connected graph possessing the property $R$, and the total number of
vertices in the components is equal to $n$. Since the vertices of a graph in $\Gamma_{n,N}(R)$
are labeled, all the connected components of the graph are distinct; therefore the
number of elements in $\tilde\Gamma_{n,N}(R)$ is equal to $N!\, a_{n,N}$, where $a_{n,N}$ is the number of
elements of the set $\Gamma_{n,N}(R)$ consisting of the unordered collection of components.

Now let us impose a restriction on the property R of graphs. Let a graph possess 
the property R if and only if the property holds for each connected component: 
The property R is then called decomposable. 

Set $a_0 = 1$, $b_0 = 0$ and introduce the generating functions

$$A(x) = \sum_{n=0}^{\infty} \frac{a_n x^n}{n!}, \qquad B(x) = \sum_{n=0}^{\infty} \frac{b_n x^n}{n!}.$$


Lemma 1.3.1. If the property $R$ is decomposable, then

$$a_{n,N} = \frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}, \tag{1.3.1}$$

where the summation is taken over nonnegative integers $n_1, \ldots, n_N$ such that
$n_1 + \cdots + n_N = n$.

Proof. With $n_1 + \cdots + n_N = n$ and $n_1, \ldots, n_N \ge 1$, let $\tilde a_n(n_1, \ldots, n_N)$ denote
the number of graphs in $\tilde\Gamma_{n,N}(R)$ with ordered components of sizes $n_1, \ldots, n_N$.
To construct all $\tilde a_n(n_1, \ldots, n_N)$ such graphs, we decompose the $n$ labeled vertices
into $N$ groups so that there are $n_i$ vertices in the $i$th group, $i = 1, \ldots, N$. This
can be done in $n!/(n_1! \cdots n_N!)$ ways. From $n_i$ vertices, we construct a connected
graph possessing the property $R$; this can be done in $b_{n_i}$ ways. Thus the number
of ordered sets of connected components of sizes $n_1, \ldots, n_N$ is

$$\tilde a_n(n_1, \ldots, n_N) = \frac{n!\, b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}.$$

Since $N$ components can be ordered in $N!$ ways, the number $a_n(n_1, \ldots, n_N)$ of
unordered sets, or the number of graphs in $\Gamma_{n,N}(R)$ having exactly $N$ components
of sizes $n_1, \ldots, n_N$, is

$$a_n(n_1, \ldots, n_N) = \frac{1}{N!}\, \tilde a_n(n_1, \ldots, n_N) = \frac{n!\, b_{n_1} \cdots b_{n_N}}{N!\, n_1! \cdots n_N!}. \tag{1.3.2}$$

Summing (1.3.2) over all $n_1, \ldots, n_N$ with $n_1 + \cdots + n_N = n$ gives (1.3.1). □


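Lemma 1.3.2. If the property $R$ is decomposable, then

$$A(x) = e^{B(x)}.$$

Proof. As follows from (1.3.1), the number $a_n$ of all graphs in $\Gamma_n(R)$ is

$$a_n = \sum_{N=1}^{n} \frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}. \tag{1.3.3}$$

By dividing both sides of this equality by $n!$, multiplying by $x^n$, and summing over
$n$, we get the chain of equalities

$$\begin{aligned}
A(x) - 1 &= \sum_{n=1}^{\infty} \sum_{N=1}^{n} \frac{1}{N!} \sum_{n_1 + \cdots + n_N = n}
\frac{b_{n_1} x^{n_1} \cdots b_{n_N} x^{n_N}}{n_1! \cdots n_N!} \\
&= \sum_{N=1}^{\infty} \frac{1}{N!} \left( \sum_{k=1}^{\infty} \frac{b_k x^k}{k!} \right)^{N}
= \sum_{N=1}^{\infty} \frac{(B(x))^N}{N!} = e^{B(x)} - 1,
\end{aligned}$$

which proves the lemma. □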
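
In computations, the relation $A(x) = e^{B(x)}$ is conveniently used in coefficient form. Differentiating gives $A'(x) = B'(x) A(x)$, which is equivalent to the recurrence $a_n = \sum_{k=1}^{n} \binom{n-1}{k-1} b_k a_{n-k}$. The sketch below applies this recurrence; the function name is an assumption, and the two test cases anticipate Examples 1.3.1 and 1.3.3 ($b_k = (k-1)!$ for cycles, $b_k = 1$ for blocks of a set partition).

```python
from math import comb, factorial


def graphs_from_connected(b, n_max):
    """Return [a_0, ..., a_n_max] given b[k] = number of connected graphs on k
    labeled vertices (b[0] = 0), using a_n = sum_k C(n-1, k-1) b_k a_{n-k}."""
    a = [1]  # a_0 = 1
    for n in range(1, n_max + 1):
        a.append(sum(comb(n - 1, k - 1) * b[k] * a[n - k] for k in range(1, n + 1)))
    return a


if __name__ == "__main__":
    n_max = 6
    cycles = [0] + [factorial(k - 1) for k in range(1, n_max + 1)]
    blocks = [0] + [1] * n_max
    print(graphs_from_connected(cycles, n_max))  # numbers of permutations
    print(graphs_from_connected(blocks, n_max))  # numbers of set partitions
```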


Let us define the uniform distribution on the set $\Gamma_n(R)$ and consider the random
variables $\alpha_m$ equal to the number of components of size $m$ in a random graph
from $\Gamma_n(R)$. The total number of components $\nu_n$ of a random graph from $\Gamma_n(R)$
is related to these variables by $\nu_n = \alpha_1 + \cdots + \alpha_n$. Arrange the components in
order of nondecreasing sizes and denote by $\beta_m$ the size of the $m$th component in
the ordered series; if $m > \nu_n$, set $\beta_m = 0$.

We will also consider the random variables defined on the set $\tilde\Gamma_{n,N}(R)$ of ordered
sets of $N$ components. The ordered components labeled with the numbers from 1
to $N$ play the role of cells in the generalized scheme of allocating particles. Define
the uniform distribution on $\tilde\Gamma_{n,N}(R)$ and denote by $\eta_1, \ldots, \eta_N$ the sizes of the
ordered connected components of a random element in $\tilde\Gamma_{n,N}(R)$. It is then clear
that

$$P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = \frac{N!\, a_n(n_1, \ldots, n_N)}{N!\, a_{n,N}} = \frac{a_n(n_1, \ldots, n_N)}{a_{n,N}}. \tag{1.3.4}$$

Theorem 1.3.1. If the series

$$B(x) = \sum_{n=1}^{\infty} \frac{b_n x^n}{n!} \tag{1.3.5}$$

has a nonzero radius of convergence, then the random variables $\eta_1, \ldots, \eta_N$ are the
contents of cells in the generalized scheme of allocation in which the independent
identically distributed random variables $\xi_1, \ldots, \xi_N$ have the distribution

$$P\{\xi_1 = k\} = \frac{b_k x^k}{k!\, B(x)}, \quad k = 1, 2, \ldots, \tag{1.3.6}$$

where the positive value $x$ from the domain of convergence of (1.3.5) may be taken
arbitrarily.


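Proof. Let us find the conditional joint distribution of the random variables
$\xi_1, \ldots, \xi_N$ with distribution (1.3.6) under the condition $\xi_1 + \cdots + \xi_N = n$. For
such random variables,

$$P\{\xi_1 + \cdots + \xi_N = n\} = \frac{x^n}{(B(x))^N} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}, \tag{1.3.7}$$

and by virtue of (1.3.1),

$$P\{\xi_1 + \cdots + \xi_N = n\} = \frac{x^n N!}{n!\,(B(x))^N}\, a_{n,N}. \tag{1.3.8}$$

Hence, if $n_1, \ldots, n_N \ge 1$ and $n_1 + \cdots + n_N = n$, then

$$\begin{aligned}
P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\}
&= \frac{b_{n_1} \cdots b_{n_N}\, x^n}{n_1! \cdots n_N!\, (B(x))^N\, P\{\xi_1 + \cdots + \xi_N = n\}} \\
&= \frac{b_{n_1} \cdots b_{n_N}\, n!}{n_1! \cdots n_N!\, N!\, a_{n,N}},
\end{aligned}$$

and according to (1.3.2),

$$P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\} = \frac{a_n(n_1, \ldots, n_N)}{a_{n,N}}. \tag{1.3.9}$$

From (1.3.4) and (1.3.9), we obtain the relation (1.2.2) between the random
variables $\eta_1, \ldots, \eta_N$ and $\xi_1, \ldots, \xi_N$ in the generalized scheme of allocating particles
to cells. □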


In the generalized scheme of allocating particles, we usually study the random
variables $\mu_r(n, N)$ equal to the number of cells containing exactly $r$ particles
and the order statistics $\eta_{(1)}, \eta_{(2)}, \ldots, \eta_{(N)}$ obtained by arranging the contents of
cells in nondecreasing order. In this case, $\mu_r(n, N)$ is the number of components
of size $r$, and $\eta_{(1)}, \eta_{(2)}, \ldots, \eta_{(N)}$ are the sizes of the components in a random
element from $\tilde\Gamma_{n,N}(R)$ arranged in nondecreasing order. The random variables
help in studying distributions of the random variables $\alpha_1, \ldots, \alpha_n$ and the associated
variables defined on the set $\Gamma_n(R)$ of all graphs possessing the property $R$.
Lemma 1.3.3. For any positive $x$ from the domain of convergence of (1.3.5),

$$P\{\nu_n = N\} = \frac{n!\,(B(x))^N}{N!\, x^n a_n}\, P\{\xi_1 + \cdots + \xi_N = n\}. \tag{1.3.10}$$

Proof. Relation (1.3.10) follows from (1.3.8) because $P\{\nu_n = N\} = a_{n,N}/a_n$ by
definition. □


It is clear by virtue of (1.3.3) that the number $a_n$ can also be expressed in terms
of probabilities related to $\xi_1, \ldots, \xi_N$:

$$a_n = \sum_{N=1}^{\infty} \frac{n!\,(B(x))^N}{N!\, x^n}\, P\{\xi_1 + \cdots + \xi_N = n\}. \tag{1.3.11}$$

N=1 
Lemma 1.3.4. For any nonnegative integers $N, m_1, \ldots, m_n$,

$$P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n \mid \nu_n = N\} = P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\}.$$

Proof. The conditional distribution on $\Gamma_n(R)$ under the condition $\nu_n = N$ is
concentrated on the set $\Gamma_{n,N}(R)$ of graphs having exactly $N$ connected components
and is uniform on this set. Hence,

$$P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n \mid \nu_n = N\} = \frac{c_N(m_1, \ldots, m_n)}{a_{n,N}}, \tag{1.3.12}$$

where $a_{n,N}$ is the number of elements in $\Gamma_{n,N}(R)$ and $c_N(m_1, \ldots, m_n)$ is the
number of graphs in $\Gamma_{n,N}(R)$ such that the number of components of size $r$ is $m_r$,
$r = 1, 2, \ldots, n$.

Consider the above set $\tilde\Gamma_{n,N}(R)$ composed of ordered sets of $N$ components. Let
$\tilde c_N(m_1, \ldots, m_n)$ denote the number of elements in $\tilde\Gamma_{n,N}(R)$ such that the number
of components of size $r$ is $m_r$, $r = 1, 2, \ldots, n$. It is clear that

$$P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\} = \frac{\tilde c_N(m_1, \ldots, m_n)}{\tilde a_{n,N}}, \tag{1.3.13}$$

where $\tilde a_{n,N}$ is the number of elements in $\tilde\Gamma_{n,N}(R)$. The assertion of the lemma
follows from (1.3.12) and (1.3.13) because $\tilde a_{n,N} = N!\, a_{n,N}$ and
$\tilde c_N(m_1, \ldots, m_n) = N!\, c_N(m_1, \ldots, m_n)$. □


Thus, if the series (1.3.5) has a nonzero radius of convergence, then all of the
random variables expressed by $\alpha_1, \ldots, \alpha_n$ can be studied by using the generalized
scheme of allocating particles in which the random variables $\xi_1, \ldots, \xi_N$ have the
distribution (1.3.6). Roughly speaking, under the condition that the number $\nu_n$
of connected components of the graph $\Gamma_n(R)$ is $N$, the sizes of these components
(under a random ordering) have the same joint distribution as the random variables
$\eta_1, \ldots, \eta_N$ in the generalized scheme of allocating particles that are defined by
the independent random variables $\xi_1, \ldots, \xi_N$ with distribution (1.3.6). Thus, for
$\nu_n = N$ the random variables $\beta_1, \ldots, \beta_N$ are expressed in terms of $\alpha_1, \ldots, \alpha_n$
in exactly the same way as the order statistics $\eta_{(1)}, \ldots, \eta_{(N)}$ in the generalized
scheme of allocating particles are expressed in terms of $\mu_1(n, N), \ldots, \mu_n(n, N)$.
Hence, Lemma 1.3.4 implies the following assertion.


Lemma 1.3.5. For any nonnegative integers $N, k_1, \ldots, k_N$,

$$P\{\beta_1 = k_1, \ldots, \beta_N = k_N \mid \nu_n = N\} = P\{\eta_{(1)} = k_1, \ldots, \eta_{(N)} = k_N\}. \tag{1.3.14}$$


We now consider the joint distribution of $\mu_1(n, N), \ldots, \mu_n(n, N)$.


Lemma 1.3.6. For nonnegative integers $m_1, \ldots, m_n$ such that $m_1 + \cdots + m_n = N$
and $m_1 + 2m_2 + \cdots + n m_n = n$,

$$P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\}
= \frac{n!\, b_1^{m_1} \cdots b_n^{m_n}}{m_1! \cdots m_n!\, (1!)^{m_1} \cdots (n!)^{m_n}\, a_n P\{\nu_n = N\}}. \tag{1.3.15}$$

Proof. To obtain (1.3.15), it suffices to calculate $\tilde c_N(m_1, \ldots, m_n)$ in (1.3.13). It
is clear that

$$\tilde c_N(m_1, \ldots, m_n) = \sum \tilde a_n(n_1, \ldots, n_N),$$

where the summation is taken over all sets $(n_1, \ldots, n_N)$ containing the element
$r$ exactly $m_r$ times, $r = 1, \ldots, n$. The number of such sets is $N!/(m_1! \cdots m_n!)$,
and for each of them, by (1.3.2),

$$\tilde a_n(n_1, \ldots, n_N) = \frac{n!\, b_1^{m_1} \cdots b_n^{m_n}}{(1!)^{m_1} \cdots (n!)^{m_n}}.$$

Hence,

$$\tilde c_N(m_1, \ldots, m_n) = \frac{N!\, n!\, b_1^{m_1} \cdots b_n^{m_n}}{m_1! \cdots m_n!\, (1!)^{m_1} \cdots (n!)^{m_n}}.$$

To obtain formula (1.3.15), it remains to note that

$$P\{\nu_n = N\} = \frac{a_{n,N}}{a_n} = \frac{\tilde a_{n,N}}{N!\, a_n}.$$ □


Lemmas 1.3.4 and 1.3.6 enable us to express the joint distribution of the random
variables $\alpha_1, \ldots, \alpha_n$ in a random graph from $\Gamma_n(R)$.
Lemma 1.3.7. If $m_1, \ldots, m_n$ are nonnegative integers, then

$$P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\} =
\begin{cases}
\dfrac{n!}{a_n} \displaystyle\prod_{r=1}^{n} \frac{b_r^{m_r}}{m_r!\,(r!)^{m_r}} & \text{if } \displaystyle\sum_{r=1}^{n} r m_r = n, \\[2ex]
0 & \text{otherwise.}
\end{cases}$$

Proof. By the total probability formula,

$$\begin{aligned}
P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\}
&= \sum_{k=1}^{n} P\{\nu_n = k\}\, P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n \mid \nu_n = k\} \\
&= P\{\nu_n = N\}\, P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n \mid \nu_n = N\},
\end{aligned}$$

where $N = m_1 + \cdots + m_n$. By using Lemma 1.3.4, we find that

$$P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\} = P\{\nu_n = N\}\, P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\}. \tag{1.3.16}$$

It remains to note that $P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\} = 0$ if
$m_1 + 2m_2 + \cdots + n m_n \ne n$ and that equality (1.3.15) from Lemma 1.3.6 holds for the
probability $P\{\mu_1(n, N) = m_1, \ldots, \mu_n(n, N) = m_n\}$ if $m_1 + \cdots + m_n = N$ and
$m_1 + 2m_2 + \cdots + n m_n = n$. The substitution of (1.3.15) into (1.3.16) proves
Lemma 1.3.7. □
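
For permutations (cycles as components, so $a_n = n!$ and $b_r = (r-1)!$), the formula of Lemma 1.3.7 reduces to the classical expression $\prod_r 1/(m_r!\, r^{m_r})$ for the distribution of cycle counts. The sketch below checks the lemma by enumerating all permutations of a small degree; function names are illustrative assumptions, and exact rational arithmetic is used.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def cycle_type(perm):
    """(m_1, ..., m_n), where m_r is the number of cycles of length r in perm."""
    n = len(perm)
    seen = [False] * n
    m = [0] * n
    for start in range(n):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        m[length - 1] += 1
    return tuple(m)


def lemma_probability(m, a_n, b):
    """P{alpha_1 = m_1, ..., alpha_n = m_n} as given by Lemma 1.3.7."""
    n = len(m)
    if sum(r * m_r for r, m_r in enumerate(m, start=1)) != n:
        return Fraction(0)
    prob = Fraction(factorial(n), a_n)
    for r, m_r in enumerate(m, start=1):
        prob *= Fraction(b[r] ** m_r, factorial(m_r) * factorial(r) ** m_r)
    return prob


if __name__ == "__main__":
    n = 5
    b = [0] + [factorial(r - 1) for r in range(1, n + 1)]  # cycles on r points
    counts = Counter(cycle_type(p) for p in permutations(range(n)))
    for m, count in sorted(counts.items()):
        print(m, Fraction(count, factorial(n)), lemma_probability(m, factorial(n), b))
```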


We now turn to some examples. 


Example 1.3.1. The set $S_n$ of one-to-one mappings corresponds to the set $\Gamma_n(R)$
of graphs with $n$ vertices for which we have the property $R$: Graphs are directed
with exactly one arc entering each vertex and exactly one arc emanating from
each vertex. This property is decomposable. The connected components of such a
graph are (directed) cycles. In this case, $a_n = n!$, $b_n = (n - 1)!$, and the generating
functions

$$A(x) = \frac{1}{1 - x}, \qquad B(x) = -\log(1 - x)$$

satisfy the relation of Lemma 1.3.2:

$$A(x) = e^{B(x)}. \tag{1.3.17}$$

To study the lengths of cycles of a random permutation and the associated variables,
one can use the generalized scheme of allocating particles in which the random
variables $\xi_1, \ldots, \xi_N$ have the distribution

$$P\{\xi_1 = k\} = -\frac{x^k}{k \log(1 - x)}, \quad k = 1, 2, \ldots, \quad 0 < x < 1.$$
Example 1.3.2. The set $\Sigma_n$ of all single-valued mappings corresponds to the set
$\Gamma_n(R)$ of graphs with $n$ vertices with property $R$: The graphs are directed with
exactly one arc emanating from each vertex. This property is decomposable. Since
the number of elements of $\Sigma_n$ is $n^n$, from relation (1.3.17) for the generating
functions we find that

$$B(x) = \log A(x) = \log \sum_{n=0}^{\infty} \frac{n^n x^n}{n!},$$

yielding

$$b_n = (n - 1)! \sum_{k=0}^{n-1} \frac{n^k}{k!}.$$

The radius of convergence of $A(x)$ and $B(x)$ is $e^{-1}$, and at the point $x = e^{-1}$, they
diverge.

To study the characteristics of a random mapping, we can use the generalized
scheme of allocating particles in which the random variables $\xi_1, \ldots, \xi_N$ have the
distribution

$$P\{\xi_1 = k\} = \frac{b_k x^k}{k!\, B(x)}, \quad k = 1, 2, \ldots, \quad 0 < x < e^{-1}.$$
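
The expression for $b_n$ in Example 1.3.2 can be cross-checked by inverting the relation $A(x) = e^{B(x)}$ coefficient by coefficient. The sketch below (hypothetical function names) recovers $b_n$ from $a_n = n^n$ through the recurrence $a_n = \sum_{k=1}^{n} \binom{n-1}{k-1} b_k a_{n-k}$, which follows from $A' = B'A$, and compares the result with the closed form.

```python
from math import comb, factorial


def connected_from_all(a, n_max):
    """Return [b_0, ..., b_n_max] from a[n] = number of all graphs on n vertices,
    solving a_n = sum_{k=1}^{n} C(n-1, k-1) b_k a_{n-k} for b_n (a_0 = 1)."""
    b = [0]
    for n in range(1, n_max + 1):
        rest = sum(comb(n - 1, k - 1) * b[k] * a[n - k] for k in range(1, n))
        b.append(a[n] - rest)
    return b


def connected_mappings(n):
    """Closed form (n - 1)! * sum_{k=0}^{n-1} n^k / k!, evaluated in integers."""
    return sum(factorial(n - 1) // factorial(k) * n**k for k in range(n))


if __name__ == "__main__":
    n_max = 8
    a = [n**n for n in range(n_max + 1)]  # 0**0 == 1 in Python, so a_0 = 1
    print(connected_from_all(a, n_max)[1:])
    print([connected_mappings(n) for n in range(1, n_max + 1)])
```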


Example 1.3.3. Consider the set of all unordered partitions of the set
$X_n = \{1, 2, \ldots, n\}$ into disjoint subsets, the union of which is $X_n$. The partition of $X_n$
into unordered subsets $Y_1, \ldots, Y_N$ corresponds to the hypergraph of $\Gamma_{n,N}(R)$ with
$n$ vertices and $N$ hyperedges $Y_1, \ldots, Y_N$. Since all of the $N!$ orderings of the
hyperedges $Y_1, \ldots, Y_N$ are distinct, each hypergraph of $\Gamma_{n,N}(R)$ gives us $N!$ distinct
objects of $\tilde\Gamma_{n,N}(R)$ that are hypergraphs with $n$ vertices and $N$ ordered hyperedges
$A_1, \ldots, A_N$, with the sets of hyperedges being permutations of $Y_1, \ldots, Y_N$. The
property $R$ determining this class of graphs requires that a graph be a hypergraph
whose distinct hyperedges have no common vertices. Each connected component
of such a graph is a hyperedge. Clearly, the number of connected graphs possessing
the property $R$ with $n$ vertices is 1, that is, $b_n = 1$, so

$$B(x) = \sum_{n=1}^{\infty} \frac{x^n}{n!} = e^x - 1.$$

Since $R$ is decomposable,

$$A(x) = e^{e^x - 1}.$$

This equality, or (1.3.3), yields

$$a_n = \sum_{N=1}^{n} \frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{1}{n_1! \cdots n_N!},$$

where the second summation is over positive integers $n_1, \ldots, n_N$.

Thus, to study random partitions, we can use the generalized scheme of allocation
in which the random variables $\xi_1, \ldots, \xi_N$ have the truncated Poisson
distribution

$$P\{\xi_1 = k\} = \frac{x^k}{k!\,(e^x - 1)}, \quad k = 1, 2, \ldots, \quad 0 < x < \infty.$$


Example 1.3.4. A tree is a connected graph without cycles. As the set $\Gamma_{n,N}(R)$,
let us consider the set $\mathcal{F}_{n,N}$ of all forests consisting of $N$ trees with the total
number $n$ of labeled vertices. The trees in a forest are not ordered. The property $R$
determining this class of graphs requires that a graph be undirected without cycles.
The property $R$ is decomposable. The number $b_n$ of connected graphs possessing
the property $R$ is the number of nonrooted trees with $n$ vertices and $b_n = n^{n-2}$,
so the generating function is

$$B(x) = \sum_{n=1}^{\infty} \frac{n^{n-2} x^n}{n!}, \quad 0 < x < e^{-1}.$$

Thus, to study a random forest from $\mathcal{F}_{n,N}$, we can use the generalized scheme
of allocation in which the random variables $\xi_1, \ldots, \xi_N$ have the distribution

$$P\{\xi_1 = k\} = \frac{k^{k-2} x^k}{k!\, B(x)}, \quad k = 1, 2, \ldots, \quad 0 < x < e^{-1}.$$


1.4. Forests of nonrooted trees 


The graphs consisting of nonrooted trees and unicyclic components play the same 
role in investigating graphs as the forests of rooted trees do for graphs of mappings. 
Hence, the following sections concentrate on these objects, using the generalized 
scheme of allocation. 

As in Example 1.3.4, let $\mathcal{F}_{n,N}$ be the set of all forests of $N$ nonrooted trees with
$n$ vertices. It is known that the number of forests of $N$ ordered rooted trees with
total number $n$ of nonroot vertices is $N(N + n)^{n-1}$. In contrast to the forests of
rooted trees, there is no simple formula for the number $F_{n,N} = |\mathcal{F}_{n,N}|$ of forests
of nonrooted trees. Therefore the first step is to study the asymptotic behavior
of $F_{n,N}$.

Denote by T the number of edges in a forest belonging to $\mathcal{F}_{n,N}$. It is easy
to see that $T = n - N$. Following the general algorithm for applying the generalized
scheme of allocation, let us consider the set $\bar{\mathcal{F}}_{n,N}$ of forests that consist of N
ordered nonrooted trees, and define the uniform distribution on this set. Denote
by $\eta_1, \ldots, \eta_N$ the sizes of the ordered trees in a random graph from $\bar{\mathcal{F}}_{n,N}$. By
Cayley's formula for counting trees, the number $b_n$ of nonrooted trees with n vertices
is $n^{n-2}$. Denote by $a_n(n_1, \ldots, n_N)$ the number of elements in $\bar{\mathcal{F}}_{n,N}$ for which
$\eta_1 = n_1, \ldots, \eta_N = n_N$. It is easy to see that for positive integers $n_1, \ldots, n_N$
with $n_1 + \cdots + n_N = n$,

\[ a_n(n_1, \ldots, n_N) = \frac{n!\, b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}, \tag{1.4.1} \]

and the number of elements in $\bar{\mathcal{F}}_{n,N}$ is

\[ \bar{F}_{n,N} = \sum_{n_1 + \cdots + n_N = n} a_n(n_1, \ldots, n_N)
   = \sum_{n_1 + \cdots + n_N = n} \frac{n!\, b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!}. \]

Each forest in $\mathcal{F}_{n,N}$ corresponds to exactly N! orderings of its trees. Thus, for the
number of forests $F_{n,N}$, we obtain the formula

\[ F_{n,N} = \frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{n_1^{n_1-2} \cdots n_N^{n_N-2}}{n_1! \cdots n_N!}, \tag{1.4.2} \]

where the summation is over positive integers $n_1, \ldots, n_N$ such that
$n_1 + \cdots + n_N = n$.
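Formula (1.4.2) can be evaluated exactly for small n and N. The sketch below is only an illustration (the function name `forest_count` is hypothetical): it computes the sum over compositions as the coefficient of $x^n$ in the N-th power of the series with coefficients $k^{k-2}/k!$, using exact rational arithmetic.

```python
from fractions import Fraction
from math import factorial


def forest_count(n, N):
    """F_{n,N} from (1.4.2): forests of N nonrooted trees on n labeled vertices."""
    # weight[k] = k^(k-2) / k!  (Cayley's count divided by k!); k = 1 gives 1.
    weight = [Fraction(0)] + [
        Fraction(k ** (k - 2) if k > 1 else 1, factorial(k)) for k in range(1, n + 1)
    ]
    acc = [Fraction(1)] + [Fraction(0)] * n  # coefficients of the 0-th power
    for _ in range(N):
        nxt = [Fraction(0)] * (n + 1)
        for i, a in enumerate(acc):
            if a:
                for k in range(1, n + 1 - i):
                    nxt[i + k] += a * weight[k]
        acc = nxt
    return int(acc[n] * factorial(n) / factorial(N))


print(forest_count(4, 1), forest_count(4, 2), forest_count(4, 3))  # 16 15 6
```

For N = 1 the result is Cayley's $n^{n-2}$, and for N = n - 1 it is the number of single edges, n(n-1)/2.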
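Introduce independent identically distributed random variables $\xi_1, \ldots, \xi_N$ for
which
\[ P\{\xi_1 = k\} = \frac{b_k x^k}{k!\,B(x)} = \frac{k^{k-2} x^k}{k!\,B(x)}, \qquad k = 1, 2, \ldots, \tag{1.4.3} \]
where
\[ B(x) = \sum_{k=1}^{\infty} \frac{b_k x^k}{k!} = \sum_{k=1}^{\infty} \frac{k^{k-2} x^k}{k!}, \qquad 0 < x \le e^{-1}. \tag{1.4.4} \]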

In accordance with the results of the previous section and Example 1.3.4, the
generalized scheme of allocation can be applied to investigating random forests of
nonrooted trees, that is, relation (1.2.2) is valid: For any integers $n_1, \ldots, n_N$,

\[ P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\}. \]

For the number of forests $F_{n,N}$, formula (1.3.8) is valid, which, of course, can be
obtained directly from (1.4.2) and (1.4.3):

\[ F_{n,N} = \frac{n!\,(B(x))^N}{N!\,x^n}\, P\{\xi_1 + \cdots + \xi_N = n\}, \tag{1.4.5} \]

where B(x) is defined by (1.4.4), and the value of the parameter x in the distribution
(1.4.3) of the random variables $\xi_1, \ldots, \xi_N$ can be chosen arbitrarily from the
domain of convergence of the series B(x).

Thus, to obtain the asymptotics of $F_{n,N}$, it is sufficient to choose an appropriate
value of x, $0 < x \le e^{-1}$, and analyze the asymptotic behavior of the probability
$P\{\xi_1 + \cdots + \xi_N = n\}$ for the sum of the random variables $\xi_1, \ldots, \xi_N$ that have
the distribution (1.4.3) with the chosen value of the parameter x.

The first two moments of the random variable $\xi_1$ have the following expressions:

\[ E\xi_1 = \frac{1}{B(x)} \sum_{k=1}^{\infty} \frac{k^{k-1} x^k}{k!}, \qquad
   E\xi_1^2 = \frac{1}{B(x)} \sum_{k=1}^{\infty} \frac{k^{k} x^k}{k!}. \]

Therefore, along with B(x), we consider two functions

\[ \theta(x) = \sum_{k=1}^{\infty} \frac{k^{k-1} x^k}{k!}, \qquad a(x) = \sum_{k=1}^{\infty} \frac{k^{k} x^k}{k!}. \]
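The statement that the parameter x in (1.4.5) may be chosen arbitrarily can be checked numerically. The sketch below is a minimal illustration (the helper names and the truncation length `terms` are assumptions): it builds the distribution (1.4.3) for a given x, convolves it N times, and evaluates the right-hand side of (1.4.5) in floating point.

```python
from math import exp, factorial, lgamma, log


def tree_weight(k, x):
    """k^(k-2) x^k / k!, computed through logarithms to avoid overflow."""
    return exp((k - 2) * log(k) + k * log(x) - lgamma(k + 1))


def forests_via_allocation(n, N, x, terms=400):
    """Right-hand side of (1.4.5) for a parameter 0 < x <= 1/e."""
    B = sum(tree_weight(k, x) for k in range(1, terms + 1))  # truncated series (1.4.4)
    pmf = [0.0] + [tree_weight(k, x) / B for k in range(1, n + 1)]
    dist = [1.0] + [0.0] * n  # distribution of the empty sum
    for _ in range(N):
        nxt = [0.0] * (n + 1)
        for i, a in enumerate(dist):
            if a:
                for k in range(1, n + 1 - i):
                    nxt[i + k] += a * pmf[k]
        dist = nxt
    return factorial(n) * B ** N / (factorial(N) * x ** n) * dist[n]


for x in (0.1, 0.2, 0.35):
    print(x, round(forests_via_allocation(4, 2, x), 6))
```

Each choice of x should reproduce the same count, here the 15 forests of two trees on four labeled vertices.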


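The function $\theta(x)$ is the solution of the equation
\[ \theta e^{-\theta} = x \tag{1.4.6} \]
if we choose the solution that is less than 1.
The functions a(x) and B(x) can be represented in terms of this function.
Differentiating (1.4.6) gives

\[ \theta'(x) e^{-\theta(x)} - \theta(x)\theta'(x) e^{-\theta(x)} = 1; \]

hence,
\[ \theta'(x) = \frac{e^{\theta(x)}}{1 - \theta(x)} = \frac{\theta(x)}{x(1 - \theta(x))}. \tag{1.4.7} \]
On the other hand,
\[ x\theta'(x) = \sum_{k=1}^{\infty} \frac{k^{k} x^k}{k!} = a(x). \]
Thus
\[ a(x) = \frac{\theta(x)}{1 - \theta(x)}. \tag{1.4.8} \]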
Slightly more complicated calculations are needed to obtain the relation
\[ B(x) = \tfrac{1}{2}\bigl(1 - (1 - \theta(x))^2\bigr). \tag{1.4.9} \]
Consider the function
\[ h(x) = (1 - \theta(x))^2. \]
By using (1.4.7), we obtain

\[ h'(x) = -2(1 - \theta(x))\theta'(x) = -\frac{2\theta(x)}{x} = -2 \sum_{k=1}^{\infty} \frac{k^{k-1} x^{k-1}}{k!}. \]

When we integrate both sides of this equality, we obtain

\[ \int_0^x h'(u)\,du = h(x) - 1 = -2 \sum_{k=1}^{\infty} \frac{k^{k-1}}{k!} \int_0^x u^{k-1}\,du
   = -2 \sum_{k=1}^{\infty} \frac{k^{k-2} x^k}{k!} = -2B(x), \]

which implies equality (1.4.9).
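Relations (1.4.8) and (1.4.9) allow us to calculate the mean $E\xi_1$ and the variance
$D\xi_1$. For $0 < \theta < 1$, we set

\[ x = \theta e^{-\theta}. \]

For such a choice of the parameter x,

\[ \theta(x) = \theta, \qquad a(x) = \frac{\theta}{1 - \theta}, \qquad B(x) = \frac{\theta(2 - \theta)}{2}, \]
therefore
\[ m = E\xi_1 = \frac{\theta(x)}{B(x)} = \frac{2}{2 - \theta}, \]
\[ \sigma^2 = D\xi_1 = \frac{a(x)}{B(x)} - \Bigl(\frac{\theta(x)}{B(x)}\Bigr)^2 = \frac{2\theta}{(1 - \theta)(2 - \theta)^2}. \]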

If the parameter $\theta$ is fixed, then Theorem 1.1.11 may be applied to the sum
\[ \zeta_N = \xi_1 + \cdots + \xi_N. \]

In fact, the theorem on local convergence to the normal law is valid in a wider
region.


Theorem 1.4.1. If $N \to \infty$ and $\theta = \theta(N)$ varies such that $\theta N \to \infty$ and
$(1 - \theta)^3 N \to \infty$, then

\[ P\{\zeta_N = k\} = \frac{1}{\sigma\sqrt{2\pi N}}\, e^{-u^2/2}\,(1 + o(1)) \]

uniformly in the integers k such that $u = (k - Nm)/(\sigma\sqrt{N})$ lies in any fixed finite
interval.
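The local normal approximation of Theorem 1.4.1 can be compared with the exact distribution of $\zeta_N$ for moderate N. The sketch below assumes $\theta = 0.5$, N = 300, and the central value k = Nm = 400; it computes the exact probabilities through the standard recurrence for powers of a power series applied to the generating function of $\xi_1 - 1$. The function name `sum_pmf` is hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt


def sum_pmf(N, theta, max_excess):
    """q[j] = P{zeta_N = N + j}, j = 0..max_excess, with x = theta * exp(-theta)."""
    x = theta * exp(-theta)
    B = theta * (2 - theta) / 2
    # c[j] = P{xi_1 = j + 1} = (j+1)^(j-1) x^(j+1) / ((j+1)! B(x))
    c = [
        exp((j - 1) * log(j + 1) + (j + 1) * log(x) - lgamma(j + 2)) / B
        for j in range(max_excess + 1)
    ]
    q = [c[0] ** N] + [0.0] * max_excess
    for m in range(1, max_excess + 1):
        # Coefficient recurrence for Q = P^N, obtained from P * Q' = N * P' * Q.
        q[m] = sum((k * (N + 1) - m) * c[k] * q[m - k] for k in range(1, m + 1)) / (m * c[0])
    return q


N, theta = 300, 0.5
n = 400  # equals N * m with m = 2 / (2 - theta) = 4/3, so u = 0
sigma = sqrt(2 * theta / ((1 - theta) * (2 - theta) ** 2))

exact = sum_pmf(N, theta, n - N)[n - N]
approx = 1 / (sigma * sqrt(2 * pi * N))
print(exact, approx, exact / approx)
```

The ratio of the two printed probabilities should be close to 1; how close depends on N and on the distance of $\theta$ from 0 and 1.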


Proof. First we prove that, under the conditions of the theorem, the distribution of
$(\zeta_N - mN)/(\sigma\sqrt{N})$ converges weakly to the normal distribution with parameters
(0, 1). According to Theorem 1.1.9, it is sufficient to demonstrate convergence
of the corresponding characteristic function $\varphi_N(t)$ to the characteristic function
$e^{-t^2/2}$ of the standard normal distribution.

The characteristic function of $\xi_1$ equals

\[ \varphi(t) = \frac{1}{B(x)} \sum_{k=1}^{\infty} \frac{k^{k-2} x^k e^{itk}}{k!} = \frac{B(xe^{it})}{B(x)}. \]
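By virtue of (1.4.7), (1.4.8), and (1.4.9),

\[ B(x) = \tfrac{1}{2}\bigl(1 - (1 - \theta(x))^2\bigr), \]
\[ xB'(x) = \theta(x), \]
\[ x^2B''(x) = \theta^2(x)(1 - \theta(x))^{-1}, \]
\[ x^3B'''(x) = \theta^3(x)(3 - 2\theta(x))(1 - \theta(x))^{-3}. \]

Therefore
\[ \varphi'(t) = \frac{i\,\theta(xe^{it})}{B(x)}, \]
\[ \varphi''(t) = -\frac{\theta(xe^{it})}{B(x)(1 - \theta(xe^{it}))}, \tag{1.4.10} \]
\[ \varphi'''(t) = -\frac{i\,\theta(xe^{it})}{B(x)(1 - \theta(xe^{it}))^3}. \]
For $x = \theta e^{-\theta}$,
\[ \theta(x) = \theta, \qquad B(x) = \frac{\theta(2 - \theta)}{2}. \]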

Denote by $\psi(t)$ the characteristic function of the centered random variable
$\xi_1 - \theta(x)/B(x)$. Then

\[ \psi'(0) = 0, \qquad \psi''(0) = -\sigma^2 = -\frac{2\theta}{(1 - \theta)(2 - \theta)^2}. \tag{1.4.11} \]

Let $g(t) = \log \psi(t)$. It is not difficult to check that

\[ g'''(t) = \frac{2i\,\theta(xe^{it})\bigl(2\theta^2(xe^{it}) - \theta(xe^{it}) - 2\bigr)}
               {(1 - \theta(xe^{it}))^3\,(2 - \theta(xe^{it}))^3}. \]

Therefore, if $x = \theta e^{-\theta}$, then there exists a constant c such that

\[ |g'''(t)| \le \frac{c\,\theta}{(1 - \theta)^3} \]
and
\[ \psi(t) = e^{g(t)} = \exp\Bigl\{ -\frac{\sigma^2 t^2}{2} + O\Bigl(\frac{\theta |t|^3}{(1 - \theta)^3}\Bigr) \Bigr\}. \tag{1.4.12} \]
The characteristic function $\varphi_N(t)$ of the random variable $(\zeta_N - mN)/(\sigma\sqrt{N})$
satisfies the equality $\varphi_N(t) = \psi^N(t/(\sigma\sqrt{N}))$; hence, for any fixed t, as $N \to \infty$,

\[ \varphi_N(t) = \exp\Bigl\{ -\frac{t^2}{2} + O\Bigl(\frac{1}{\sqrt{N\theta(1 - \theta)^3}}\Bigr) \Bigr\}. \tag{1.4.13} \]
The conditions of the theorem specify that $N\theta \to \infty$, $N(1 - \theta)^3 \to \infty$; hence,
for any fixed t, as $N \to \infty$,
\[ \varphi_N(t) \to e^{-t^2/2}, \]
and the distribution of $(\zeta_N - mN)/(\sigma\sqrt{N})$ converges weakly to the standard
normal law.

To prove the local convergence of these distributions, we need additional estimates
of the characteristic function $\varphi(t)$. It is reasonable to assume that the local
theorem is valid in the same regions as the integral theorem proved above, but the
necessary estimates are complicated to find, and therefore we restrict ourselves to
a proof of the local theorem only in the case where $\theta \le \theta_0 < 1$ and $\theta N \to \infty$.

From (1.4.12), it follows that there exists $\varepsilon > 0$ such that for $|t| \le \varepsilon$ and
$\theta \le \theta_0 < 1$,

\[ |\psi(t)| \le e^{-\sigma^2 t^2/4}. \tag{1.4.14} \]

We now show that for any $\varepsilon$, $0 < \varepsilon < \pi$, there exists a positive constant c such
that for $\varepsilon \le |t| \le \pi$ and $\theta \le \theta_0 < 1$,

\[ |\varphi(t)| \le e^{-c\theta}. \tag{1.4.15} \]
If $\theta \to 0$, then
\[ x = \theta e^{-\theta} = \theta - \theta^2 + O(\theta^3), \]
\[ \varphi(t) = \frac{B(xe^{it})}{B(x)} = \frac{xe^{it} + x^2e^{2it}/2 + O(\theta^3)}{\theta(1 - \theta/2)}
   = e^{it} + (e^{2it} - e^{it})\theta/2 + O(\theta^2). \]
Now

\[ |e^{it} + (e^{2it} - e^{it})\theta/2|^2 = 1 - 2\theta\sin^2(t/2) + O(\theta^2), \]
as $\theta \to 0$; therefore

\[ |\varphi(t)| = 1 - \theta\sin^2(t/2) + O(\theta^2), \]
uniformly in t, and for $\varepsilon \le |t| \le \pi$ there exist $\delta > 0$ and $c_1 > 0$ such that
\[ |\varphi(t)| \le e^{-c_1\theta} \tag{1.4.16} \]

for $\theta \le \delta$.
For any $\theta$, $0 < \theta \le 1$, the distribution of $\xi_1$ has maximal span 1 and $\varphi(t)$ is
continuous in t and $\theta$ in the region

\[ \mathcal{B} = \{(t, \theta): \varepsilon \le |t| \le \pi,\ 0 < \delta \le \theta \le 1\}. \]
Therefore
\[ q = \sup_{\mathcal{B}} |\varphi(t)| < 1, \]
and there exists $c_2 > 0$ such that
\[ |\varphi(t)| \le e^{-c_2\theta} \tag{1.4.17} \]

for $(t, \theta) \in \mathcal{B}$.

This estimate and (1.4.16) imply (1.4.15).

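Proving the local theorem, we follow the proof of Theorem 1.1.11 as a model
for similar proofs. We set
\[ u = \frac{k - mN}{\sigma\sqrt{N}}, \qquad P_N(k) = P\{\zeta_N = k\}, \]

and represent the difference

\[ R_N = 2\pi\Bigl(\sigma\sqrt{N}\,P_N(k) - \frac{1}{\sqrt{2\pi}}\,e^{-u^2/2}\Bigr) \]

as a sum of the following four integrals:

\[ I_1 = \int_{-A}^{A} e^{-itu}\Bigl(\bigl(\psi(t/(\sigma\sqrt{N}))\bigr)^N - e^{-t^2/2}\Bigr)\,dt, \]
\[ I_2 = -\int_{A \le |t|} e^{-itu - t^2/2}\,dt, \]
\[ I_3 = \int_{A \le |t| \le \varepsilon\sigma\sqrt{N}} e^{-itu}\bigl(\psi(t/(\sigma\sqrt{N}))\bigr)^N\,dt, \]
\[ I_4 = \int_{\varepsilon\sigma\sqrt{N} \le |t| \le \pi\sigma\sqrt{N}} e^{-itu}\bigl(\psi(t/(\sigma\sqrt{N}))\bigr)^N\,dt, \]

where the constants A and $\varepsilon$ will be chosen later.
To see that $R_N \to 0$ as $N \to \infty$, we show that $R_N$ can be made arbitrarily
small by the choice of $\varepsilon$, A, and N. It is clear that

\[ |I_2| \le \int_{A \le |t|} e^{-t^2/2}\,dt, \]
and $|I_2|$ can be made arbitrarily small by choosing a sufficiently large A.
Choose $\varepsilon > 0$ such that estimate (1.4.14) is fulfilled. Then, for $\theta \le \theta_0 < 1$,
$|t| \le \varepsilon\sigma\sqrt{N}$,

\[ \bigl|\psi(t/(\sigma\sqrt{N}))\bigr|^N \le e^{-t^2/4}, \]

so that
\[ |I_3| \le \int_{A \le |t| \le \varepsilon\sigma\sqrt{N}} \bigl|\psi(t/(\sigma\sqrt{N}))\bigr|^N\,dt \le \int_{A \le |t|} e^{-t^2/4}\,dt, \]

and $|I_3|$ can be made arbitrarily small by the choice of sufficiently large A.

For fixed A, the integral $I_1$ tends to zero because $\bigl(\psi(t/(\sigma\sqrt{N}))\bigr)^N \to e^{-t^2/2}$
uniformly with respect to t in any finite interval.

Finally, with the help of estimate (1.4.17), we obtain that as $N \to \infty$,

\[ |I_4| \le \int_{\varepsilon\sigma\sqrt{N} \le |t| \le \pi\sigma\sqrt{N}} \bigl|\psi(t/(\sigma\sqrt{N}))\bigr|^N\,dt
   \le \sigma\sqrt{N} \int_{\varepsilon \le |t| \le \pi} |\varphi(t)|^N\,dt
   \le 2\pi\sigma\sqrt{N}\,e^{-c_2\theta N} \to 0. \]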


Denote by $p(u; \alpha, \beta)$ the density of the stable law with parameters $\alpha$ and $\beta$ in
Zolotarev's parameterization (see [60]). If $\alpha \ne 1$, the characteristic function f(t)
of this distribution can be represented in the form

\[ f(t) = \exp\Bigl\{ -|t|^{\alpha} \exp\Bigl\{ -\frac{i\pi}{2} K(\alpha)\beta \frac{t}{|t|} \Bigr\} \Bigr\}, \]
where $K(\alpha) = 1 - |1 - \alpha|$. By the inversion formula,
\[ p(u; \alpha, \beta) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itu}
   \exp\Bigl\{ -|t|^{\alpha} \exp\Bigl\{ -\frac{i\pi}{2} K(\alpha)\beta \frac{t}{|t|} \Bigr\} \Bigr\}\,dt. \tag{1.4.18} \]
If $N \to \infty$ and $\theta = 1$, then the distribution of $(\zeta_N - 2N)/(bN^{2/3})$, where
$b = 2(2/3)^{2/3}$, is approximated by the stable distribution with parameters
$\alpha = 3/2$, $\beta = -1$.

Theorem 1.4.2. If $N \to \infty$, $\theta = 1$, $b = 2(2/3)^{2/3}$, then
\[ bN^{2/3} P\{\zeta_N = n\} = p(u; 3/2, -1)(1 + o(1)) \]

uniformly in the integers n such that $u = (n - 2N)/(bN^{2/3})$ lies in any fixed finite
interval.

Proof. The terms of the sum $\zeta_N = \xi_1 + \cdots + \xi_N$ are independent identically
distributed random variables, and for $\theta = 1$,

\[ P\{\xi_1 = k\} = \frac{2k^{k-2} e^{-k}}{k!}, \qquad k = 1, 2, \ldots, \tag{1.4.19} \]

and $E\xi_1 = 2$, since $\theta(e^{-1}) = 1$ and $B(e^{-1}) = 1/2$. The maximal span of the
distribution is 1; therefore, by Theorem 1.1.10, it suffices to prove that the distribution
of $(\zeta_N - 2N)/(bN^{2/3})$ converges weakly to the stable law given in the theorem.

In addition to $\theta(x)$, a(x), and B(x) defined above, we consider the function

\[ C(z) = \sum_{k=1}^{\infty} \frac{k^{k-3} z^k}{k!}. \]

This can be expressed in terms of $\theta(z)$: Let
\[ g(z) = (1 - \theta(z))^3. \]

By using the equalities
\[ z\theta'(z) = \frac{\theta(z)}{1 - \theta(z)} \]
and
\[ B(z) = \tfrac{1}{2}\bigl(1 - (1 - \theta(z))^2\bigr), \]
we easily obtain
\[ zg'(z) = -3\theta(z) + 3\theta^2(z) = 3\theta(z) - 6B(z). \]

Integration then gives

\[ \int_0^z g'(u)\,du = g(z) - 1 = 3\int_0^z \frac{\theta(u)}{u}\,du - 6\int_0^z \frac{B(u)}{u}\,du = 3B(z) - 6C(z). \]

Expressing B(z) in terms of $\theta(z)$ demonstrates that, for $|z| \le e^{-1}$,
\[ C(z) = \tfrac{5}{12} - \tfrac{1}{4}(1 - \theta(z))^2 - \tfrac{1}{6}(1 - \theta(z))^3. \]

Since $\theta(e^{-1}) = 1$, we find that $C(e^{-1}) = 5/12$.
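The critical values $B(e^{-1}) = 1/2$ and $C(e^{-1}) = 5/12$ can be confirmed by summing the defining series at $x = e^{-1}$. The sketch below is illustrative only; the helper name and the truncation at 2000 terms are assumptions, and because the terms decay only polynomially at the critical point, the partial sums approach the limits from below with errors of roughly $3 \cdot 10^{-6}$ and $10^{-9}$ respectively.

```python
from math import exp, lgamma, log


def critical_series(shift, terms=2000):
    """Partial sum of sum_{k>=1} k^(k - shift) e^(-k) / k!."""
    return sum(
        exp((k - shift) * log(k) - k - lgamma(k + 1)) for k in range(1, terms + 1)
    )


print(critical_series(2), 1 / 2)    # B(1/e)
print(critical_series(3), 5 / 12)   # C(1/e)
```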
Set

\[ u(z) = 1 - \theta(z), \qquad v(z) = C(z) - C(e^{-1}). \]
We have shown that
\[ v(z) = -\tfrac{1}{4}u^2(z) - \tfrac{1}{6}u^3(z). \]
If we invert this expression, we obtain two formal solutions

\[ u(z) = \pm 2i\sqrt{v(z)} + \tfrac{4}{3}v(z) + O(|v(z)|^{3/2}); \]

since $u(x) > 0$ and $v(x) < 0$ for $0 < x < e^{-1}$, we choose the solution
\[ u(z) = -2i\sqrt{v(z)} + \tfrac{4}{3}v(z) + O(|v(z)|^{3/2}). \tag{1.4.20} \]

Hence,
\[ (1 - \theta(z))^2 = u^2(z) = -4v(z) - \tfrac{16}{3}\,i\,(v(z))^{3/2} + O(|v(z)|^2). \tag{1.4.21} \]

The first two derivatives of C(z) are

\[ C'(z) = \sum_{k=1}^{\infty} \frac{k^{k-2} z^{k-1}}{k!} = \frac{B(z)}{z}, \]
\[ C''(z) = \sum_{k=1}^{\infty} \frac{k^{k-1} z^{k-2}}{k!} - \frac{B(z)}{z^2} = \frac{\theta(z) - B(z)}{z^2}. \]
Therefore, for real t,
\[ C(e^{-1+it}) - C(e^{-1}) = it/2 + O(t^2). \tag{1.4.22} \]

Now we find an expression of the characteristic function $\varphi(t)$ of the random
variable $\xi_1$ with distribution (1.4.19). It is clear that

\[ \varphi(t) = B(e^{-1+it})/B(e^{-1}). \]
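From (1.4.20), (1.4.21), and (1.4.22), we find that for $z = e^{-1+it}$,
\[ \varphi(t) = 1 - (1 - \theta(z))^2 \]
\[ \quad = 1 + 4v(z) + \tfrac{16}{3}\,i\,(v(z))^{3/2} + O(|v(z)|^2) \]
\[ \quad = 1 + 2it + \tfrac{4}{3}\sqrt{2}\,|t|^{3/2}\, i\Bigl(\frac{it}{|t|}\Bigr)^{3/2} + O(t^2) \]
\[ \quad = 1 + 2it + |bt|^{3/2}\, i\Bigl(\frac{it}{|t|}\Bigr)^{3/2} + O(t^2), \]

where $b = 2(2/3)^{2/3}$. By virtue of the equality

\[ i\Bigl(\frac{it}{|t|}\Bigr)^{3/2} = -\exp\Bigl\{\frac{i\pi t}{4|t|}\Bigr\}, \]
we can rewrite the last relation as

\[ \varphi(t) = \frac{B(e^{-1+it})}{B(e^{-1})} = 1 + 2it - |bt|^{3/2} \exp\Bigl\{\frac{i\pi t}{4|t|}\Bigr\} + O(t^2). \]
Since

\[ e^{-2it} = 1 - 2it + O(t^2) \]
as $t \to 0$, we find that

\[ \psi(t) = e^{-2it}\varphi(t) = 1 - |bt|^{3/2} \exp\Bigl\{\frac{i\pi t}{4|t|}\Bigr\} + O(t^2). \]

The characteristic function of the random variable $(\zeta_N - 2N)/(bN^{2/3})$ is

\[ \psi^N\bigl(t/(bN^{2/3})\bigr) = \Bigl(1 - \frac{|t|^{3/2}}{N} \exp\Bigl\{\frac{i\pi t}{4|t|}\Bigr\} + O\Bigl(\frac{t^2}{N^{4/3}}\Bigr)\Bigr)^N \]

and converges to

\[ f(t) = \exp\Bigl\{ -|t|^{3/2} \exp\Bigl\{\frac{i\pi t}{4|t|}\Bigr\} \Bigr\} \]

at any fixed t. The function f(t) is the characteristic function of the stable law
$p(u; \alpha, \beta)$ with parameters $\alpha = 3/2$, $\beta = -1$. Therefore, according to Theorem
1.1.10, as $N \to \infty$,

\[ bN^{2/3} P\{\zeta_N = k\} - p(u; 3/2, -1) \to 0 \]

uniformly in k, where $u = (k - 2N)/(bN^{2/3})$. The function $p(x; 3/2, -1)$ is
positive for any x; hence,

\[ bN^{2/3} P\{\zeta_N = k\} = p(u; 3/2, -1)(1 + o(1)) \]

uniformly in k such that $u = (k - 2N)/(bN^{2/3})$ lies in any fixed finite interval.
□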


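We now turn to the estimate of the number of forests $F_{n,N}$ with n vertices, N
trees, and $T = n - N$ edges. Theorems 1.4.1 and 1.4.2 allow us to estimate the
number of forests.


Theorem 1.4.3. If $n \to \infty$ and $\theta = 2T/n$ varies such that $\theta N \to \infty$ and
$N(1 - \theta)^3 \to \infty$, then

\[ F_{n,N} = \frac{n^{2T}\sqrt{1 - \theta}}{2^T\,T!}\,(1 + o(1)). \tag{1.4.23} \]
Proof. Put
\[ \theta = 2T/n, \qquad x = \theta e^{-\theta}. \tag{1.4.24} \]
By virtue of (1.4.5),
\[ F_{n,N} = \frac{n!\,(B(x))^N}{N!\,x^n}\, P\{\zeta_N = n\}, \tag{1.4.25} \]

where the parameters are chosen so that

\[ B(x) = \tfrac{1}{2}\bigl(1 - (1 - \theta)^2\bigr) = \frac{\theta(2 - \theta)}{2} = \frac{2TN}{n^2}. \tag{1.4.26} \]
Since $m = E\xi_1 = 2/(2 - \theta) = n/N$, by Theorem 1.4.1,

\[ P\{\zeta_N = n\} = \frac{1}{\sigma\sqrt{2\pi N}}\,(1 + o(1)), \tag{1.4.27} \]
where

\[ \sigma^2 = \frac{2\theta}{(1 - \theta)(2 - \theta)^2} = \frac{nT}{(1 - \theta)N^2}. \]
If we substitute (1.4.24), (1.4.26), and (1.4.27) into (1.4.25), we can conclude that
under the conditions of the theorem,

\[ F_{n,N} = \frac{n!\,(B(x))^N \sqrt{N(1 - \theta)}}{N!\,x^n \sqrt{2\pi nT}}\,(1 + o(1))
   = \frac{n^{2T}\sqrt{1 - \theta}}{2^T\,T^T e^{-T}\sqrt{2\pi T}}\,(1 + o(1)), \]

and Stirling's formula for T! gives (1.4.23).
□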
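The asymptotic formula for $F_{n,N}$ in the subcritical range $\theta = 2T/n < 1$ can be compared with exact counts. The sketch below assumes the leading term $n^{2T}\sqrt{1 - \theta}/(2^T T!)$ and obtains the exact value from (1.4.2) by the recurrence for powers of a power series; the function names are hypothetical, and the example n = 60, N = 50 is an arbitrary choice.

```python
from fractions import Fraction
from math import factorial, sqrt


def forest_count(n, N):
    """Exact F_{n,N}: (n!/N!) times the coefficient of x^(n-N) in (B(x)/x)^N."""
    T = n - N
    # c[j] = (j+1)^(j-1) / (j+1)!, the coefficients of B(x)/x; c[0] = 1.
    c = [Fraction((j + 1) ** (j - 1) if j > 0 else 1, factorial(j + 1)) for j in range(T + 1)]
    q = [Fraction(1)] + [Fraction(0)] * T
    for m in range(1, T + 1):
        q[m] = sum((k * (N + 1) - m) * c[k] * q[m - k] for k in range(1, m + 1)) / m
    return int(q[T] * factorial(n) / factorial(N))


def forest_count_asymptotic(n, N):
    """Leading term n^(2T) sqrt(1 - theta) / (2^T T!), with theta = 2T/n < 1."""
    T = n - N
    theta = 2 * T / n
    return n ** (2 * T) * sqrt(1 - theta) / (2 ** T * factorial(T))


n, N = 60, 50
ratio = forest_count(n, N) / forest_count_asymptotic(n, N)
print(ratio)
```

For moderate n the ratio is only approximately 1; the theorem is a statement about the limit $n \to \infty$.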
Theorem 1.4.4. If $n \to \infty$ and $2T/n \to 1$ so that
\[ (1 - 2T/n)N^{1/3} \to (2/3)^{2/3}\,v, \qquad -\infty < v < \infty, \]
then
\[ F_{n,N} = \frac{\sqrt{\pi}\,(3/2)^{2/3}\,n^n}{2^N N!\,N^{1/6}}\; p(-v; 3/2, -1)(1 + o(1)). \tag{1.4.28} \]
Proof. Under the conditions of the theorem,
\[ u = \frac{n - 2N}{bN^{2/3}} = -(1 - 2T/n)N^{1/3}\,\frac{n}{bN} \to -v; \]
thus, by Theorem 1.4.2, continuity, and positivity of the density $p(u; 3/2, -1)$,
\[ bN^{2/3} P\{\zeta_N = n\} = p(-v; 3/2, -1)(1 + o(1)). \tag{1.4.29} \]

We chose $\theta = 1$ in Theorem 1.4.2; hence, $x = e^{-1}$ and $B(x) = B(e^{-1}) = 1/2$.
Having substituted these values and (1.4.29) into (1.4.25), we conclude that, under
the conditions of Theorem 1.4.4,

\[ F_{n,N} = \frac{n!}{N!\,2^N e^{-n}\, bN^{2/3}}\; p(-v; 3/2, -1)(1 + o(1))
   = \frac{\sqrt{\pi}\,(3/2)^{2/3}\,n^n}{2^N N!\,N^{1/6}}\; p(-v; 3/2, -1)(1 + o(1)). \]
□


Although the density $p(x; 3/2, -1)$ cannot be represented in terms of simple
functions, we can use the relation $p(x; \alpha, \beta) = p(-x; \alpha, -\beta)$ and the following
series expansion for $x > 0$ and $1 < \alpha < 2$ for our calculations:


1 Pn t/a), oan aed 
PGE LG) Te cos (1+ (147) p).

1.5. Trees of given sizes in a random forest


Let $\mu_r = \mu_r(n, N)$ be the number of trees with r vertices in a random forest
with n labeled vertices and N nonrooted trees, r = 1, 2, .... Recall that such a
forest has $T = n - N$ edges. In this section, we consider the asymptotic behavior
of the random variables $\mu_r(n, N)$. Following the approach established in the
previous section, we use the generalized scheme of allocation of n particles to
N cells determined by identically distributed random variables $\xi_1, \ldots, \xi_N$ such
that

\[ P\{\xi_1 = k\} = p_k(\theta) = \frac{2k^{k-2}\theta^{k-1}e^{-k\theta}}{k!\,(2 - \theta)}, \qquad k = 1, 2, \ldots, \quad 0 < \theta < 2. \]
As we have calculated,

\[ \mu = \mu(\theta) = E\xi_1 = \frac{2}{2 - \theta}, \]

and for $0 < \theta < 1$,

\[ \sigma^2 = \sigma^2(\theta) = D\xi_1 = \frac{2\theta}{(1 - \theta)(2 - \theta)^2}. \]

We will also use the notation

\[ \sigma_{rr}^2 = \sigma_{rr}^2(\theta) = p_r\Bigl(1 - p_r - \frac{(\mu - r)^2}{\sigma^2}\,p_r\Bigr), \qquad r = 1, 2, \ldots. \]


The random variables $\mu_r$ behave much like the corresponding variables for a
random forest of rooted trees. We highlight some of these results; see [30] for
a complete description. As before, let $\theta = 2T/n$. Again the value $\theta = 1$ is of
particular interest, so we introduce the following notations: For r = 1, 2, ...,

\[ \pi_r = \pi_r(\theta) = \begin{cases} p_r(\theta), & 0 < \theta < 1, \\ p_r(1), & 1 \le \theta < 2, \end{cases} \]

\[ \tilde\sigma_{rr}^2 = \tilde\sigma_{rr}^2(\theta) = \begin{cases} \sigma_{rr}^2(\theta), & 0 < \theta < 1, \\ p_r(1)(1 - p_r(1)), & 1 \le \theta < 2. \end{cases} \]

The truncated values $\pi_r(\theta)$ and $\tilde\sigma_{rr}^2(\theta)$ allow us to summarize the rather
complicated behavior of $\mu_r$, $r \ge 3$, in the following two theorems.


Theorem 1.5.1. If $n, N \to \infty$ and $r = r(n) \ge 3$ varies such that $N\pi_r(\theta) \to \infty$,
then

\[ P\{\mu_r = k\} = \frac{1}{\tilde\sigma_{rr}(\theta)\sqrt{2\pi N}}\, e^{-u_r^2/2}\,(1 + o(1)) \]

uniformly in the integers k such that
\[ u_r = \frac{k - N\pi_r(\theta)}{\tilde\sigma_{rr}(\theta)N^{1/2}} \]

lies in any fixed finite interval.


Theorem 1.5.2. If $n, N \to \infty$ and $r = r(n) \ge 3$ varies such that $N\pi_r(\theta) \to \lambda$
for some $\lambda$, $0 < \lambda < \infty$, then for any fixed k = 0, 1, ...,

\[ P\{\mu_r = k\} = \frac{\lambda^k e^{-\lambda}}{k!}\,(1 + o(1)). \]


The random variables $\mu_1$ and $\mu_2$, like their analogs for forests of rooted trees,
have some special properties, but we will not discuss them.

When edges are added sequentially to a forest, then by Theorems 1.5.1 and 1.5.2,
the asymptotic behavior of $\mu_r$ does not depend on $\theta$ if $\theta \ge 1$. If $Np_r(1) \to \infty$,
then the limit distribution of $\mu_r$, with similar centering and normalizing, is the
standard normal distribution for all $\theta$, $1 \le \theta < 2$.

There are similar results for the case $\theta \ge 1$ and $Np_r(1) \to \lambda$ for some $\lambda$,
$0 < \lambda < \infty$, with the limit distribution of the $\mu_r$ for all $\theta$, $1 \le \theta < 2$, being the
Poisson distribution with parameter $\lambda$. Thus the point $\theta = 1$ can be interpreted as
a critical point in the evolution of a random forest.

We now prove Theorems 1.5.1 and 1.5.2. 


Proof of Theorems 1.5.1 and 1.5.2. According to Example 1.3.4 and Lemma
1.2.1,

\[ P\{\mu_r = k\} = \binom{N}{k} p_r^k (1 - p_r)^{N-k}\, \frac{P\{\zeta_{N-k}^{(r)} = n - kr\}}{P\{\zeta_N = n\}}, \tag{1.5.1} \]

where $\zeta_N = \xi_1 + \cdots + \xi_N$, $\zeta_{N-k}^{(r)} = \xi_1^{(r)} + \cdots + \xi_{N-k}^{(r)}$, the random variables
$\xi_1, \ldots, \xi_N$; $\xi_1^{(r)}, \ldots, \xi_{N-k}^{(r)}$ are independent and identically distributed,
\[ p_k = P\{\xi_1 = k\} = \frac{k^{k-2}x^k}{k!\,B(x)}, \]
\[ B(x) = \sum_{k=1}^{\infty} \frac{k^{k-2}x^k}{k!}, \qquad 0 < x \le e^{-1}, \]
\[ P\{\xi_1^{(r)} = k\} = P\{\xi_1 = k \mid \xi_1 \ne r\}, \tag{1.5.2} \]
and the parameter x of the distribution of $\xi_1, \ldots, \xi_N$ may be taken arbitrarily from
the domain of convergence of the series B(x).

We set $\theta = 2T/n$. It is convenient to choose $x = \theta e^{-\theta}$ for $0 < \theta < 1$ and
$x = e^{-1}$ for $1 \le \theta < 2$. With these choices, (1.5.1) gives

\[ P\{\mu_r = k\} = \binom{N}{k} (\pi_r(\theta))^k (1 - \pi_r(\theta))^{N-k}\, \frac{P\{\zeta_{N-k}^{(r)} = n - kr\}}{P\{\zeta_N = n\}}, \tag{1.5.3} \]

where
\[ P\{\xi_1 = k\} = \pi_k(\theta), \qquad k = 1, 2, \ldots, \]

and the distribution of $\xi_1^{(r)}$ is defined by (1.5.2).

Reasoning by contradiction, we see that it is sufficient to prove Theorems 1.5.1
and 1.5.2 under the assumption that $\theta$ lies in any of the following three domains:
first, where $N\theta \to \infty$ and $(1 - \theta)^3N \to \infty$; second, where $(1 - \theta)^3N$ is bounded
by an arbitrary constant; and, third, where $(1 - \theta)^3N \to -\infty$. Negating either
theorem implies the existence of a subsequence of the parameters n, N such that $\theta$
lies in one of these three domains for which the other conditions are satisfied but
for which the conclusion is false. Therefore we assume that $n, N \to \infty$ in such a
way that $\theta$ lies in one of the domains and prove the assertions of Theorems 1.5.1
and 1.5.2 in the corresponding three cases.

Consider first Theorem 1.5.1 in the first domain of $\theta$. By the de Moivre-Laplace
theorem, the binomial distribution is approximated by a normal or Poisson
distribution. More precisely, if $N\pi_r(\theta) \to \infty$, then

\[ \binom{N}{k} (\pi_r(\theta))^k (1 - \pi_r(\theta))^{N-k}
   = \frac{1 + o(1)}{\sqrt{2\pi N\pi_r(\theta)(1 - \pi_r(\theta))}}\, e^{-z^2/2} \tag{1.5.4} \]

uniformly in k such that

\[ z = \frac{k - N\pi_r(\theta)}{\sqrt{N\pi_r(\theta)(1 - \pi_r(\theta))}} \]

lies in any fixed finite interval.
The probability $P\{\zeta_N = n\}$ from the denominator of (1.5.3) has been estimated
in the previous section. Applying Theorem 1.4.1, we have for $\theta$ in the first domain,

\[ P\{\zeta_N = n\} = \frac{1}{\sigma\sqrt{2\pi N}}\,(1 + o(1)), \tag{1.5.5} \]
where
\[ \sigma^2 = \sigma^2(\theta) = D\xi_1 = \frac{2\theta}{(1 - \theta)(2 - \theta)^2}. \]

To find the asymptotics of the numerator of (1.5.3), we begin by calculating the
first and second moments:

\[ m_r = m_r(\theta) = E\xi_1^{(r)} = \frac{\mu - rp_r}{1 - p_r}, \]
\[ \sigma_r^2 = \sigma_r^2(\theta) = D\xi_1^{(r)} = \frac{\sigma^2}{1 - p_r}\Bigl(1 - \frac{p_r(\mu - r)^2}{(1 - p_r)\sigma^2}\Bigr), \]

where $\mu = E\xi_1 = 2/(2 - \theta)$.

A proof similar to that of Theorem 1.4.1 shows that a normal approximation is
valid for the sum $\zeta_N^{(r)} = \xi_1^{(r)} + \cdots + \xi_N^{(r)}$. More precisely, if $n, N \to \infty$ such that
$\theta N \to \infty$ and $(1 - \theta)^3N \to \infty$, then

\[ P\{\zeta_N^{(r)} = s\} = \frac{e^{-(s - Nm_r)^2/(2\sigma_r^2N)}}{\sigma_r\sqrt{2\pi N}}\,(1 + o(1)) \tag{1.5.6} \]

uniformly in $r \ge 3$ and s such that $(s - Nm_r)/(\sigma_r\sqrt{N})$ lies in any fixed finite
interval.

We now use (1.5.6) with $s = n - kr$ and $N - k$ summands to obtain an asymptotic
expression for $P\{\zeta_{N-k}^{(r)} = n - kr\}$. Since

\[ k = Np_r + u_r\sigma_{rr}\sqrt{N}, \]
where
\[ \sigma_{rr}^2 = \sigma_{rr}^2(\theta) = p_r\bigl(1 - p_r - (\mu - r)^2p_r/\sigma^2\bigr), \]
we have

\[ N - k = N(1 - p_r) - u_r\sigma_{rr}\sqrt{N} = N(1 - p_r)\Bigl(1 - \frac{u_r\sigma_{rr}}{(1 - p_r)\sqrt{N}}\Bigr). \tag{1.5.7} \]

It is easy to see that $\sigma_{rr}/(1 - p_r)$ is bounded, and for $u_r$ lying in any finite interval,
\[ N - k = N(1 - p_r)(1 + O(N^{-1/2})). \tag{1.5.8} \]
The exponent in (1.5.6) may now be written as

\[ -\frac{z_r^2}{2} = -\frac{(n - kr - (N - k)m_r)^2}{2\sigma_r^2(N - k)}. \]
Taking into account (1.5.7), (1.5.8), and the equalities
\[ n = N\mu, \qquad m_r - r = \frac{\mu - r}{1 - p_r}, \qquad m_r - \mu = \frac{p_r(\mu - r)}{1 - p_r}, \]

which hold for $\theta$ in the first domain, we obtain

\[ z_r = \frac{k(m_r - r) - N(m_r - \mu)}{\sigma_r(N - k)^{1/2}}
       = \frac{p_r^{1/2}(\mu - r)(k - Np_r)}{\sigma\sigma_{rr}(N - k)^{1/2}}
       = \frac{p_r^{1/2}(\mu - r)u_r}{\sigma(1 - p_r)^{1/2}}\,(1 + o(1)). \]
Applying (1.5.7) gives
\[ P\{\zeta_{N-k}^{(r)} = n - kr\} = \frac{1}{\sigma_r\sqrt{2\pi N(1 - p_r)}}\,
   e^{-p_r(\mu - r)^2u_r^2/(2\sigma^2(1 - p_r))}\,(1 + o(1)). \tag{1.5.9} \]

When we substitute (1.5.4), (1.5.5), and (1.5.9) into (1.5.3), we see that under the
conditions of Theorem 1.5.1 with $\theta$ in the first domain, this expression transforms
into the product of an exponent and a coefficient. The coefficient of the exponent is

\[ \frac{\sigma\sqrt{2\pi N}}{\sqrt{2\pi Np_r(1 - p_r)}\;\sigma_r\sqrt{2\pi N(1 - p_r)}}
   = \frac{1}{\sqrt{2\pi Np_r(1 - p_r)\bigl(1 - p_r(\mu - r)^2/((1 - p_r)\sigma^2)\bigr)}}
   = \frac{1}{\sigma_{rr}\sqrt{2\pi N}}. \]

Combining the exponents from (1.5.4) and (1.5.9) yields the resulting exponent

\[ -\frac{(k - Np_r)^2}{2Np_r(1 - p_r)} - \frac{p_r(\mu - r)^2(k - Np_r)^2}{2\sigma^2(1 - p_r)N\sigma_{rr}^2}
   = -\frac{(k - Np_r)^2}{2N\sigma_{rr}^2} = -\frac{u_r^2}{2}. \]

Thus Theorem 1.5.1 is proved for $\theta$ varying in the first domain.
Under the conditions of Theorem 1.5.2, k is fixed, and when we apply (1.5.5)
and (1.5.6) with the corresponding parameters, we obtain the ratio

\[ \frac{P\{\zeta_{N-k}^{(r)} = n - kr\}}{P\{\zeta_N = n\}} \to 1. \]
Therefore the assertion of Theorem 1.5.2 follows from the Poisson approximation
of the first factor in (1.5.3).


In the second domain, we choose the parameter $\theta$ of the distribution of $\xi_1, \ldots, \xi_N$
to be 1. If $Np_r(1) \to \infty$, then

\[ \binom{N}{k} (p_r(1))^k (1 - p_r(1))^{N-k}
   = \frac{1}{\sqrt{2\pi Np_r(1)(1 - p_r(1))}}\, e^{-z^2/2}\,(1 + o(1)) \tag{1.5.10} \]
uniformly in k such that
\[ z = \frac{k - Np_r(1)}{\sqrt{Np_r(1)(1 - p_r(1))}} \]
lies in any fixed finite interval.
Applying Theorem 1.4.2 gives
\[ bN^{2/3} P\{\zeta_N = n\} = p(u; 3/2, -1)(1 + o(1)) \tag{1.5.11} \]

uniformly in n such that $u = (n - 2N)/(bN^{2/3})$ lies in any fixed finite interval.
Restricting the random variables $\xi_1^{(r)}, \ldots, \xi_N^{(r)}$ does not affect their maximum
span and convergence to the stable law with density $p(u; 3/2, -1)$. The only
difference is that now the mean of a summand is
$E\xi_1^{(r)} = m_r(1) = (2 - rp_r(1))/(1 - p_r(1))$.
Therefore, as $j \to \infty$,

\[ \frac{bj^{2/3}}{(1 - p_r(1))^{2/3}}\, P\{\zeta_j^{(r)} = l\}
   = \frac{bj^{2/3}}{(1 - p_r(1))^{2/3}}\,
     P\Bigl\{ \frac{(\zeta_j^{(r)} - jm_r(1))(1 - p_r(1))^{2/3}}{bj^{2/3}}
            = \frac{(l - jm_r(1))(1 - p_r(1))^{2/3}}{bj^{2/3}} \Bigr\}
   = p(u; 3/2, -1)(1 + o(1)) \]
uniformly in l such that

\[ u = \frac{(l - jm_r(1))(1 - p_r(1))^{2/3}}{bj^{2/3}} \]

lies in any fixed finite interval.
By substituting $N - k$ for j and $n - kr$ for l and recalling that

\[ k = Np_r(1) + z\sqrt{Np_r(1)(1 - p_r(1))}, \]
where z is bounded, we have
\[ bN^{2/3} P\{\zeta_{N-k}^{(r)} = n - kr\} = p(u; 3/2, -1)(1 + o(1)) \tag{1.5.12} \]
uniformly in $r \ge 3$, where, as in (1.5.11), $u = (n - 2N)/(bN^{2/3})$, since

\[ \frac{(n - kr - (N - k)m_r(1))(1 - p_r(1))^{2/3}}{b(N - k)^{2/3}} = \frac{n - 2N}{bN^{2/3}} + o(1). \]

Thus the asymptotics of $P\{\zeta_N = n\}$ and $P\{\zeta_{N-k}^{(r)} = n - kr\}$ is the same and their
ratio in (1.5.3) tends to 1. Therefore the asymptotics of $P\{\mu_r = k\}$ is determined by
the first factor and coincides with the asymptotics of the corresponding binomial
probability. Theorems 1.5.1 and 1.5.2 have now been proved in the second domain.

It remains to prove the theorems for the third domain, where $(1 - 2T/n)^3N \to
-\infty$. We choose $\theta = 1$ in the distribution of the random variables $\xi_1, \ldots, \xi_N$ and
prove that in (1.5.3) the ratio

\[ P\{\zeta_{N-k}^{(r)} = n - kr\}/P\{\zeta_N = n\} \to 1 \tag{1.5.13} \]

uniformly in r, and $k = Np_r(1) + z\sqrt{Np_r(1)(1 - p_r(1))}$, where z lies in any
fixed finite interval.

In this case, $(1 - 2T/n)^3N \to -\infty$, so the values n for the sum $\zeta_N$ and the
values $n - kr$ for the sum $\zeta_{N-k}^{(r)}$ lie in what is called the region of large deviations.
Therefore we need to apply the theorem on large deviations. We will not give the
proof, but the main idea is simple: If the distribution of a sum of independent
identically distributed integer-valued random variables with zero mean converges
to a stable law with parameter $\alpha$, $1 < \alpha < 2$, then the major contribution to a large
deviation of the sum is made by only one of the summands (see [137]). Applying
this theorem to the sum $\zeta_N$ gives the following result for $\theta$ in the third domain.
If $n, N \to \infty$ such that $N(1 - 2T/n)^3 \to -\infty$, then

\[ P\{\zeta_N = n\} = P\{\zeta_N - 2N = n - 2N\} \]
\[ \quad = NP\{\xi_1 - 2 = n - 2N\}(1 + o(1)) \]
\[ \quad = \Bigl(\frac{2}{\pi}\Bigr)^{1/2} \frac{N}{(n - 2N)^{5/2}}\,(1 + o(1)). \]

The theorem given in [137] cannot be applied to the sum $\zeta_{N-k}^{(r)}$ since its summands
become noninteger after centering by the expectation $m_r$. Britikov, using
the method given in [137], along with ideas from [58] and [113], proved in [30]
that the probability $P\{\zeta_{N-k}^{(r)} = n - kr\}$ has the same asymptotics as $P\{\zeta_N = n\}$.

More precisely, if $n, N \to \infty$ such that $N(1 - 2T/n)^3 \to -\infty$, then
\[ P\{\zeta_{N-k}^{(r)} = n - kr\} = P\{\zeta_{N-k}^{(r)} - (N - k)m_r = n - kr - (N - k)m_r\} \]
\[ \quad = (N - k)P\{\xi_1^{(r)} - m_r = n - kr - (N - k)m_r\}(1 + o(1)) \]
\[ \quad = (N - k)P\{\xi_1^{(r)} = n - 2N + O(\sqrt{N})\}(1 + o(1)) \]
\[ \quad = \Bigl(\frac{2}{\pi}\Bigr)^{1/2} \frac{N}{(n - 2N)^{5/2}}\,(1 + o(1)) \]

uniformly in $r \ge 1$ and k such that

\[ (k - Np_r(1))/(Np_r(1)(1 - p_r(1)))^{1/2} \]

lies in any fixed finite interval.

Thus the ratio in (1.5.3) tends to 1, and the asymptotics of $P\{\mu_r = k\}$ is determined
by the first factor and coincides with the asymptotics of the corresponding
binomial probability. This proves Theorems 1.5.1 and 1.5.2 in the third domain.

The proof of Theorems 1.5.1 and 1.5.2 is now complete. □


1.6. Maximum size of trees in a random forest


The results of the previous section give some information on the behavior of the
maximum size $\eta_{(N)}$ of trees in a random forest from $\mathcal{F}_{n,N}$ with $T = n - N$ edges.
Indeed, if $\theta = 2T/n \to 0$ and there exists $r = r(n, N)$ such that $Np_r(\theta) \to \infty$
and $Np_{r+1}(\theta) \to \lambda$, $0 \le \lambda < \infty$, then the distribution of the number $\mu_r$ of trees
of size r approaches a normal distribution, and the distribution of $\mu_{r+1}$ approaches
the Poisson distribution with parameter $\lambda$. This implies that the limit distribution
of the random variable $\eta_{(N)}$ is concentrated on the points r and r + 1.

If $\theta = 2T/n \to \gamma$, $\gamma > 0$, then there are infinitely many $r = r(n, N)$ such that
the distribution of $\mu_r$ approaches a Poisson distribution; hence, the distribution
of $\eta_{(N)}$ is scattered more and more as $\gamma$ increases. If $0 < \gamma < 1$, then the limit
distribution is concentrated on a countable set of integers, whereas if $\gamma \ge 1$, then
$\eta_{(N)}$ must be normalized to have a limit distribution, and the normalizing values
tend to infinity at different rates, depending on the region of $\theta$.

Thus, it should be possible to prove the limit theorems for $\eta_{(N)}$ when $T/n \to 0$
by using results on $\mu_r$ from the previous section. But if $2T/n \to \gamma$ for $\gamma > 0$,
this approach may not work, and even if it did, the proofs would not be simple.

Therefore we choose instead to use the approach based on the generalized
scheme of allocation. Let $\xi_1, \ldots, \xi_N$ be random variables with distribution

\[ p_r(\theta) = P\{\xi_1 = r\} = \frac{2r^{r-2}\theta^{r-1}e^{-r\theta}}{r!\,(2 - \theta)}, \tag{1.6.1} \]
where r = 1, 2, .... We choose $\theta = 2T/n$. Then, according to Lemma 1.2.2,
\[ P\{\eta_{(N)} \le r\} = (1 - P_r)^N\, \frac{P\{\zeta_N^{(r)} = n\}}{P\{\zeta_N = n\}}, \tag{1.6.2} \]
where
\[ \zeta_N = \xi_1 + \cdots + \xi_N, \qquad \zeta_N^{(r)} = \xi_1^{(r)} + \cdots + \xi_N^{(r)}, \]
with $\xi_1^{(r)}, \ldots, \xi_N^{(r)}$ being independent identically distributed random variables such
that
\[ P\{\xi_1^{(r)} = k\} = P\{\xi_1 = k \mid \xi_1 \le r\}, \qquad k = 1, \ldots, r, \tag{1.6.3} \]
and
\[ P_r = P_r(\theta) = P\{\xi_1 > r\} = \sum_{k=r+1}^{\infty} p_k(\theta). \tag{1.6.4} \]

We now state the theorems that completely describe the behavior of $\eta_{(N)}$, deferring
their proofs. Our procedure follows Britikov [28].


Theorem 1.6.1. If $n, N \to \infty$, $\theta = 2T/n \to 0$, and the integers
$r = r(n, N) \ge 1$
vary such that $Np_r(\theta) \to \infty$ and $Np_{r+1}(\theta) \to \lambda$ for $0 \le \lambda < \infty$, then

\[ P\{\eta_{(N)} = r\} = e^{-\lambda} + o(1), \]
\[ P\{\eta_{(N)} = r + 1\} = 1 - e^{-\lambda} + o(1). \]
Note that if $\lambda \ne 0$ in the conditions of the theorem, then $Np_r(\theta) \to \infty$
without any additional requirements. In particular, the conditions of the theorem
are fulfilled if

\[ T/n^{1 - 1/r} \to \rho, \qquad 0 < \rho < \infty. \]

Under this condition, Theorem 1.6.1 was proved by Erdős and Rényi [37], whose
well-known paper provided the only results on the behavior of $\eta_{(N)}$ until Britikov's
work seventeen years later [28].


Theorem 1.6.2. If $n, N \to \infty$, $\theta = 2T/n \to \gamma$, $0 < \gamma < 1$, then for any fixed
integer k,

\[ P\{\eta_{(N)} - [a] \le k\} = \exp\Bigl\{ -\frac{(\gamma - 1 - \log\gamma)^{5/2}}{(e^{\gamma - 1} - \gamma)\sqrt{2\pi}}\,
   e^{-(k - \{a\})(\gamma - 1 - \log\gamma)} \Bigr\}(1 + o(1)), \]

where
\[ a = \frac{\log n - \frac{5}{2}\log\log n}{\theta - 1 - \log\theta}, \]

and [a] and {a} denote, respectively, the integer and fractional parts of a.


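Theorem 1.6.3. If $n, N \to \infty$, $\theta = 2T/n \to 1$, and $N(1 - \theta)^3 \to \infty$, then for
any fixed z,

\[ P\{\beta\eta_{(N)} - u \le z\} \to e^{-e^{-z}}, \]
where $\beta = -\log(\theta e^{1-\theta})$ and u is the root of the equation

\[ \Bigl(\frac{2}{\pi}\Bigr)^{1/2} N\beta^{3/2} = u^{5/2}e^{u}. \]


Theorem 1.6.4. If $n, N \to \infty$ such that $N^{1/3}(1 - 2T/n) \to v$, $-\infty < v < \infty$,
then for any fixed positive z,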


P| NN) <z} +14 1 ae 3 oe 
Zz ny — —_— 
bN2 = Pp 3/2—=D st af PO 


sal 


where b = 2(2/3)7/3, 


Py — x1 — +++ — X53 3/2, -1) 
I,(w, y) = |) > a -: - xz, 
ae i; (a1 -++x5)°/? pos 


A = (G@1,--..%9)! a7 = w, GH 1,..., 5], 


and $p(y; 3/2, -1)$ is the density of the stable law with parameters $\alpha = 3/2$,
$\beta = -1$.

Theorem 1.6.5. If $n, N \to \infty$, $N(1 - 2T/n)^3 \to -\infty$, then for any fixed z,

\[ P\Bigl\{ \frac{n - 2N - \eta_{(N)}}{bN^{2/3}} \le z \Bigr\} \to \int_{-\infty}^{z} p(y; 3/2, -1)\,dy. \]


We will prove Theorems 1.6.1-1.6.5 with the help of relation (1.6.2). Under
the conditions of Theorems 1.6.1-1.6.3,
\[ P\{\zeta_N^{(r)} = n\}/P\{\zeta_N = n\} \to 1, \tag{1.6.5} \]

and the limit distribution of $\eta_{(N)}$ is the same as the limit distribution of the maximum
of the random variables $\xi_1, \ldots, \xi_N$. Therefore we first obtain some auxiliary
results on the asymptotic behavior of

\[ P_r = P_r(\theta) = \sum_{k=r+1}^{\infty} p_k(\theta). \]


Lemma 1.6.1. If $n, N \to \infty$, $\theta = 2T/n \to 0$, and the integers $r = r(n, N) \ge 1$
vary such that $Np_r(\theta) \to \infty$, $Np_{r+1}(\theta) \to \lambda$, $0 \le \lambda < \infty$, then

\[ NP_{r-1} \to \infty, \qquad NP_r \to \lambda, \qquad NP_{r+1} \to 0. \]


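Proof. Under the conditions of the lemma, $x = \theta e^{-\theta} \to 0$. It follows from (1.6.4)
that

\[ P_r = \sum_{s=1}^{\infty} p_{r+s}(\theta) = p_{r+1}(\theta)\Bigl(1 + \sum_{s=2}^{\infty} \frac{p_{r+s}(\theta)}{p_{r+1}(\theta)}\Bigr), \tag{1.6.6} \]
\[ P_{r+1} = p_{r+1}(\theta) \sum_{s=2}^{\infty} \frac{p_{r+s}(\theta)}{p_{r+1}(\theta)}. \tag{1.6.7} \]

Taking into account the bounds for factorials
\[ \sqrt{2\pi r}\,r^r e^{-r} \le r! \le \sqrt{2\pi r}\,r^r e^{-r + 1/(12r)}, \]

we find from (1.6.1) that

\[ \frac{p_{r+s}(\theta)}{p_{r+1}(\theta)} \le c_1(xe)^{s-1}, \]

where $c_1$ is a constant. Hence,

\[ \sum_{s=2}^{\infty} \frac{p_{r+s}(\theta)}{p_{r+1}(\theta)} \le \frac{c_1xe}{1 - xe} \to 0 \]
as $\theta \to 0$. Now by virtue of (1.6.6) and (1.6.7),

\[ NP_r = Np_{r+1}(\theta)(1 + o(1)) = \lambda + o(1), \]
\[ NP_{r-1} \ge Np_r(\theta) \to \infty, \qquad NP_{r+1} \to 0. \]
□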


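Note that if $\lambda \ne 0$, then $Np_r(\theta) \to \infty$ without any additional conditions, so
this requirement may be excluded from the conditions of the lemma if $\lambda \ne 0$.
Indeed,

\[ Np_r(\theta) = Np_{r+1}(\theta)\,\frac{p_r(\theta)}{p_{r+1}(\theta)}. \]

Since $x \to 0$ and

\[ \frac{p_r(\theta)}{p_{r+1}(\theta)} = \Bigl(\frac{r}{r+1}\Bigr)^{r-2}\frac{1}{x} = \Bigl(1 - \frac{1}{r+1}\Bigr)^{r-2}\frac{1}{x}, \]

there exists a constant $c_2$ such that $Np_r(\theta) \ge c_2Np_{r+1}(\theta)/x$ and $Np_r(\theta) \to \infty$.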


Lemma 1.6.2. Ifn, N > 0,0 =2T/n > y,0<y <1, andr=r(n,N) > 
00, then 


NP, = Np,-(6)c(1 — c)7!(1 + 0(1)), 


where c = ye!~’. 


Proof. It is clear that 


Pr+s() 
NP, = Np,(@ 
Pr( yea 8)" 
and 
Pr+s(@) _ r ce 5 
70) (. =) (xe)°(1+ O(1/r)). 


Moreover, there exist constants c3 > 0 and g < 1 such that 
Pr+s(0)/ Pr (0) < 3(xe)® < 3q°. 


Therefore the series )°°°, p++s(9)/ pr(9) converges uniformly and we can pass 
to the limit under the sum so that 


9 Pr+s(@)/Pr(@) > Ye = — 


— c 
s=l s=l1 
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Lemma 1.6.3. Ifn, N > 00, 0 = 2T/n > 1, and N(1 — 6)? > on, then for 
any fixed z, 


NP, > e 7”, 
where r is an integer such that Br = u +z + o0(1), B = —log(Ge!~*), and u is 
the root of the equation 
2 1/2 
(=) Bo? = wSl2e", 
bs 
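Proof. It is clear under the conditions of the lemma that $\beta = -\log(\theta e^{1-\theta}) \to 0$
and $u \to \infty$, since $N\beta^{3/2} \to \infty$ by virtue of the condition $N(1 - \theta)^3 \to \infty$. We
apply Stirling's formula and obtain

\[ N P_r = N \sum_{k>r} p_k(\theta) = \left( \frac{2}{\pi} \right)^{1/2} N \beta^{3/2} \sum_{k>r} \beta\, (\beta k)^{-5/2} e^{-\beta k} (1 + o(1)). \]

The sum

\[ \sum_{k>r} \beta\, (\beta k)^{-5/2} e^{-\beta k} \]

is an integral sum of the function $f(y) = y^{-5/2} e^{-y}$ with step $\beta$ and is approximated
by the corresponding integral:

\[ \sum_{k>r} \beta\, (\beta k)^{-5/2} e^{-\beta k} = \int_{r\beta}^{\infty} y^{-5/2} e^{-y}\, dy\, (1 + o(1)) = (r\beta)^{-5/2} e^{-r\beta} (1 + o(1)). \]

Therefore

\[ N P_r = \left( \frac{2}{\pi} \right)^{1/2} N \beta^{3/2} (r\beta)^{-5/2} e^{-r\beta} (1 + o(1)). \tag{1.6.8} \]

By definition, $r\beta = u + z + o(1)$ and

\[ \left( \frac{2}{\pi} \right)^{1/2} N \beta^{3/2} = u^{5/2} e^{u}. \]

Substituting these expressions into (1.6.8) yields

\[ N P_r = e^{-z} (1 + o(1)). \]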
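
The root $u$ of the equation $(2/\pi)^{1/2} N \beta^{3/2} = u^{5/2} e^{u}$ in Lemma 1.6.3 has no closed form, but the right-hand side is increasing in $u > 0$, so the root can be located numerically. The following Python sketch is illustrative only: the function names `beta_of` and `solve_u` and the sample values of $N$ and $\theta$ are assumptions introduced here, not part of the text.

```python
import math


def beta_of(theta: float) -> float:
    """beta = -log(theta * e^(1 - theta)), as in Lemma 1.6.3 (0 < theta < 1)."""
    return -(math.log(theta) + 1.0 - theta)


def solve_u(N: float, theta: float) -> float:
    """Root u > 0 of (2/pi)^(1/2) * N * beta^(3/2) = u^(5/2) * e^u, by bisection."""
    beta = beta_of(theta)
    # Take logarithms of both sides: 2.5*log(u) + u = log_target.
    log_target = 0.5 * math.log(2.0 / math.pi) + math.log(N) + 1.5 * math.log(beta)

    def g(u: float) -> float:
        return 2.5 * math.log(u) + u - log_target

    # g is increasing; g(lo) < 0 < g(hi) for these (hypothetical) bracket ends.
    lo, hi = 1e-300, max(1.0, log_target) + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    N, theta, z = 10**6, 0.9, 0.0
    u = solve_u(N, theta)
    # An integer r with beta * r close to u + z, as required by the lemma.
    r = round((u + z) / beta_of(theta))
    print(u, r)
```

Working with logarithms keeps the computation stable when $N$ is very large; the integer $r$ printed at the end is only one admissible choice, since the lemma fixes $\beta r$ up to $o(1)$.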


Now we are ready to prove the theorems of this section.

Proof of Theorems 1.6.1-1.6.3. By applying Lemma 1.6.1, we find that under
the conditions of Theorem 1.6.1,

\[ (1 - P_{r-1})^N \to 0, \qquad (1 - P_r)^N \to e^{-\lambda}, \qquad (1 - P_{r+1})^N \to 1 \]

as $N \to \infty$. These relations, together with (1.6.5), whose proof is pending, imply
the assertion of Theorem 1.6.1.
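Let

\[ a = \frac{\log n - \frac{5}{2} \log\log n}{\theta - 1 - \log\theta} \]

and choose $r = [a] + k$, where $k$ is a fixed integer. Under the conditions of
Theorem 1.6.2, $r = [a] + k \to \infty$ and according to Lemma 1.6.2,

\[ N P_r = N p_r(\theta)\, c (1 - c)^{-1} (1 + o(1)), \]

where $c = \gamma e^{1-\gamma}$. It is easy to see that

\[ N p_r(\theta) = \frac{2N r^{r-2} \theta^{r-1} e^{-r\theta}}{(2 - \theta)\, r!} = \frac{n\, e^{r(1 - \theta + \log\theta)}}{\theta\, r^{5/2} \sqrt{2\pi}} (1 + o(1)) \]
\[ = \frac{(\gamma - 1 - \log\gamma)^{5/2}}{\gamma \sqrt{2\pi}}\, e^{-(k - \{a\})(\gamma - 1 - \log\gamma)} (1 + o(1)). \]

Thus

\[ N P_r = \frac{(\gamma - 1 - \log\gamma)^{5/2} e^{1-\gamma}}{(1 - \gamma e^{1-\gamma}) \sqrt{2\pi}}\, e^{-(k - \{a\})(\gamma - 1 - \log\gamma)} (1 + o(1)), \]

and consequently,

\[ (1 - P_r)^N = \exp\left\{ -\frac{(\gamma - 1 - \log\gamma)^{5/2} e^{1-\gamma}}{(1 - \gamma e^{1-\gamma}) \sqrt{2\pi}}\, e^{-(k - \{a\})(\gamma - 1 - \log\gamma)} \right\} (1 + o(1)). \]

Here $[a]$ and $\{a\}$ denote the integer and fractional parts of $a$.

Under the conditions of Theorem 1.6.3, Lemma 1.6.3 shows that $N P_r \to e^{-z}$
and

\[ (1 - P_r)^N \to e^{-e^{-z}}. \]

Thus, to complete the proof of Theorems 1.6.1-1.6.3, it remains to verify (1.6.5)
under each set of conditions.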

Since $\theta N \to \infty$ and $N(1 - \theta)^3 \to \infty$, by Theorem 1.4.1 the random sum $\zeta_N$
is asymptotically normal, and

\[ P\{\zeta_N = n\} = \frac{1}{\sigma(\theta) \sqrt{2\pi N}} (1 + o(1)), \]

where

\[ \sigma^2(\theta) = D\xi_1 = \frac{2\theta}{(1 - \theta)(2 - \theta)^2}. \]

While estimating the asymptotic behavior of $(1 - P_r)^N$ in Lemmas 1.6.1-1.6.3,
we determined the choice of $r$. We now prove the central limit theorem for the sum
$\zeta_N^{(r)}$ for these choices of $r$. Set $B_N = \sigma(\theta) N^{1/2}$.

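The characteristic function of the random variable $\xi_1^{(r)} - m(\theta)$, where $m(\theta) = E\xi_1$, is

\[ \frac{e^{-itm(\theta)}}{1 - P_r} \sum_{k=1}^{r} p_k(\theta) e^{itk} = \frac{e^{-itm(\theta)}}{1 - P_r} \Bigl( \varphi(t) - \sum_{k>r} p_k(\theta) e^{itk} \Bigr), \]

where $\varphi(t)$ is the characteristic function of the random variable $\xi_1$. Hence, the
characteristic function $\varphi_r(t, \theta)$ of the random variable $(\zeta_N^{(r)} - Nm(\theta))/B_N$ can be
written

\[ \varphi_r(t, \theta) = \frac{e^{-itNm(\theta)/B_N}}{(1 - P_r)^N}\, \varphi^N\left( \frac{t}{B_N} \right) \Bigl( 1 - \sum_{k>r} p_k(\theta) e^{itk/B_N} (1 + o(1)) \Bigr)^{N}. \]

According to Theorem 1.4.1, the distribution of $(\zeta_N - Nm(\theta))/B_N$ converges to
the standard normal law, and consequently,

\[ e^{-itNm(\theta)/B_N} \varphi^N(t/B_N) \to e^{-t^2/2}. \tag{1.6.9} \]

It is clear that

\[ \sum_{k>r} p_k(\theta) e^{itk/B_N} = P_r + \sum_{k>r} p_k(\theta) \bigl( e^{itk/B_N} - 1 \bigr) = P_r + O\Bigl( \frac{1}{B_N} \sum_{k>r} k p_k(\theta) \Bigr), \]

and it is not difficult to prove in each of the three cases that

\[ \frac{1}{B_N} \sum_{k>r} k p_k(\theta) = o(1/N). \tag{1.6.10} \]

Estimates (1.6.9) and (1.6.10) imply that for any fixed $t$,

\[ \varphi_r(t, \theta) \to e^{-t^2/2}, \]

and the distribution of $(\zeta_N^{(r)} - Nm(\theta))/B_N$ converges to the standard normal
distribution. The local convergence

\[ P\{\zeta_N^{(r)} = n\} = \frac{1}{\sigma(\theta) \sqrt{2\pi N}} (1 + o(1)) \]

needed for the proof of (1.6.5) can be proved in the standard way.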
Thus the ratio in (1.6.5) tends to 1, and this, together with the estimates of
$(1 - P_r)^N$, completes the proof of Theorems 1.6.1-1.6.3. □

To prove Theorem 1.6.4, the following lemma is needed.

Lemma 1.6.4. If $N \to \infty$, the parameter $\theta$ in the distribution (1.6.1) equals 1,
$N^{1/3}(1 - 2T/n) \to v$, and $r = bzN^{2/3}$, where $z$ is a positive constant, then

\[ bN^{2/3} P\{\zeta_N^{(r)} = n\} = f(z, v) + o(1), \]

where

\[ f(z, y) = \exp\left\{ \frac{z^{-3/2}}{2\sqrt{\pi}} \right\} \left( p(y; 3/2, -1) + \sum_{s=1}^{\infty} \frac{1}{s!} \left( -\frac{3}{4\sqrt{\pi}} \right)^{s} I_s(z, y) \right), \]

and $I_s(z, y)$ is defined in Theorem 1.6.4.


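Proof. As $N \to \infty$,

\[ p_k = p_k(1) = \frac{2k^{k-2} e^{-k}}{k!} = \left( \frac{2}{\pi} \right)^{1/2} k^{-5/2} (1 + o(1)) \tag{1.6.11} \]

uniformly in $k > r$.
It is clear that

\[ \sum_{k>r} k^{-5/2} \exp\left\{ \frac{itk}{bN^{2/3}} \right\} = \frac{1}{b^{3/2} N} \sum_{k>r} \left( \frac{k}{bN^{2/3}} \right)^{-5/2} \exp\left\{ \frac{itk}{bN^{2/3}} \right\} \frac{1}{bN^{2/3}}. \]

The last sum is an integral sum of the function $y^{-5/2} e^{ity}$ with step $1/(bN^{2/3})$;
hence,

\[ \sum_{k>r} k^{-5/2} \exp\left\{ \frac{itk}{bN^{2/3}} \right\} = \frac{1}{b^{3/2} N} \left( \int_z^{\infty} y^{-5/2} e^{ity}\, dy + o(1) \right). \tag{1.6.12} \]

Set

\[ H(t, z) = \frac{3}{4\sqrt{\pi}} \int_z^{\infty} y^{-5/2} e^{ity}\, dy. \]

Then

\[ |H(t, z)| \le H(0, z) = \frac{3}{4\sqrt{\pi}} \int_z^{\infty} y^{-5/2}\, dy = \frac{z^{-3/2}}{2\sqrt{\pi}}. \tag{1.6.13} \]

Taking into account $b = 2(2/3)^{2/3}$, we obtain from (1.6.12) and (1.6.13) that

\[ \sum_{k>r} p_k \exp\left\{ \frac{itk}{bN^{2/3}} \right\} = \left( \frac{2}{\pi} \right)^{1/2} \sum_{k>r} k^{-5/2} \exp\left\{ \frac{itk}{bN^{2/3}} \right\} (1 + o(1)) = \frac{H(t, z) + o(1)}{N}. \tag{1.6.14} \]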


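In particular,

\[ N P_r = H(0, z)(1 + o(1)). \tag{1.6.15} \]

The characteristic function $\varphi_r(t, 1)$ of the random variable $(\zeta_N^{(r)} - 2N)/(bN^{2/3})$
can be written

\[ \varphi_r(t, 1) = \left( \varphi\left( \frac{t}{bN^{2/3}}, 1 \right) - \sum_{k>r} p_k \exp\left\{ \frac{it(k - 2)}{bN^{2/3}} \right\} \right)^{N} (1 - P_r)^{-N}, \]

where $\varphi(t, 1)$ is the characteristic function of $\xi_1 - E\xi_1$. Note that $E\xi_1 = 2$ in this
case. It follows from (1.6.13), (1.6.14), and Theorem 1.4.2 that

\[ \varphi_r(t, 1) = \varphi^N\left( \frac{t}{bN^{2/3}}, 1 \right) \left( 1 - \sum_{k>r} p_k \exp\left\{ \frac{itk}{bN^{2/3}} \right\} (1 + o(1)) \right)^{N} (1 - P_r)^{-N} \]
\[ = \psi(t) \left( 1 - \frac{H(t, z) + o(1)}{N} \right)^{N} \left( 1 - \frac{H(0, z)}{N} + o\left( \frac{1}{N} \right) \right)^{-N}, \]

where $\psi(t)$ is the characteristic function of the stable distribution with density
$p(y; 3/2, -1)$. Thus, for any fixed $t$, as $N \to \infty$,

\[ \varphi_r(t, 1) \to g(t, z) = \psi(t) \exp\{-H(t, z) + H(0, z)\}. \]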


The function $g(t, z)$ is continuous; therefore, by Theorem 1.1.9, it is a characteristic
function. Since $|g(t, z)|$ is integrable, it corresponds to the density

\[ f(z, y) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ity} g(t, z)\, dt. \]

The span of the distribution of $\xi_1^{(r)}$ is 1; therefore, by Theorem 1.1.10, the local
convergence is valid.

Thus it remains to show that $f(z, y)$ has the form given in Theorem 1.6.4.
Representing $e^{-H(t, z)}$ by its Taylor series gives

\[ f(z, y) = e^{H(0, z)} \sum_{s=0}^{\infty} \frac{(-1)^s}{s!} f_s(z, y), \tag{1.6.16} \]

where

\[ f_s(z, y) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ity} \psi(t) H^s(t, z)\, dt. \]


It is easy to see that the function $2\sqrt{\pi}\, z^{3/2} H(t, z)$ is the characteristic function of
the distribution with density

\[ p_z(y) = \tfrac{3}{2} z^{3/2} y^{-5/2}, \qquad y \ge z. \tag{1.6.17} \]

Therefore the function

\[ \bigl( 2\sqrt{\pi}\, z^{3/2} \bigr)^{s} \psi(t) H^s(t, z) \]

is the characteristic function of the sum $\beta + \beta_1 + \cdots + \beta_s$ of independent random
variables, where $\beta$ has the stable law with density $p(y; 3/2, -1)$ and $\beta_1, \ldots, \beta_s$ are
identically distributed with density $p_z(y)$. The density of the sum $\beta + \beta_1 + \cdots + \beta_s$
is

\[ h_s(y) = \bigl( 3z^{3/2}/2 \bigr)^{s} I_s(z, y), \]

where $I_s(z, y)$ is defined in Theorem 1.6.4. Thus

\[ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ity} \psi(t) H^s(t, z)\, dt = \left( \frac{3}{4\sqrt{\pi}} \right)^{s} I_s(z, y). \]

When we substitute this expression into (1.6.16), we find that

\[ f(z, y) = e^{H(0, z)} \sum_{s=0}^{\infty} \frac{1}{s!} \left( -\frac{3}{4\sqrt{\pi}} \right)^{s} I_s(z, y). \tag{1.6.18} \]

Taking into account (1.6.15), Theorem 1.4.2, and (1.6.18), we see that Theorem 1.6.4
follows from (1.6.2).

To prove Theorem 1.6.5 with the help of (1.6.2), we need to know the asymptotic
behavior of large deviations of $P\{\zeta_N^{(r)} = n\}$. We give that information without proof
(see [28]).

Lemma 1.6.5. If $n, N \to \infty$, the parameter $\theta$ in the distribution (1.6.1) equals 1,
$N(1 - 2T/n)^3 \to -\infty$, and $r = n - 2N - bzN^{2/3}$, where $z$ is a constant, then

\[ P\{\zeta_N^{(r)} = n\} = \left( \frac{2}{\pi} \right)^{1/2} \frac{N}{(n - 2N)^{5/2}} \int_z^{\infty} p(y; 3/2, -1)\, dy\, (1 + o(1)). \tag{1.6.19} \]

The assertion of Theorem 1.6.5 follows from (1.6.19), Theorem 1.4.2, and the
fact that $N P_r \to 0$ under the condition of Theorem 1.6.5.


1.7. Graphs with unicyclic components 


A graph is called unicyclic if it is connected and contains only one cycle. The
number of edges of a unicyclic graph coincides with the number of its vertices. Let
$\mathcal{U}_n$ denote the set of all graphs with $n$ vertices where every connected component is
unicyclic. Any graph from $\mathcal{U}_n$ has $n$ edges. In this section, we study the structure of
a random graph from $\mathcal{U}_n$. We follow the general approach described in Section 1.2.

As usual, denote by $u_n$ the number of graphs in $\mathcal{U}_n$; we will study $u_n$ as $n \to \infty$.
Let $b_n$ be the number of unicyclic graphs with $n$ vertices, and $b_n^{(r)}$ the number of
unicyclic graphs with $n$ vertices, where the cycle has size $r$. The cycle of a unicyclic
graph is nondirected; in other aspects, a unicyclic graph is similar to the connected
graph of a mapping of a finite set into itself. Let $d_n$ be the number of connected
graphs of mappings of a set with $n$ labeled vertices into itself, and $d_n^{(r)}$ the number
of such graphs with the cycle of size $r$. It is easy to see that

\[ d_n = \sum_{r=1}^{n} d_n^{(r)}, \]

\[ b_n^{(1)} = d_n^{(1)}, \qquad b_n^{(2)} = d_n^{(2)}, \qquad b_n^{(r)} = d_n^{(r)}/2, \quad r \ge 3. \]


Introduce the generating functions

\[ B(x) = \sum_{n=1}^{\infty} \frac{b_n x^n}{n!}, \qquad d(x) = \sum_{n=1}^{\infty} \frac{d_n x^n}{n!}, \qquad c(x) = \sum_{n=1}^{\infty} \frac{n^{n-2} x^n}{n!}. \]

These functions can be represented in terms of the function

\[ \theta(x) = \sum_{n=1}^{\infty} \frac{n^{n-1} x^n}{n!}, \]

which is the root of the equation $\theta e^{-\theta} = x$ in the interval $(0, 1]$. This function was
used in Section 1.4. Taking into account the notation introduced here and using
the results of Section 1.4, we see that

\[ d(x) = -\log(1 - \theta(x)), \qquad c(x) = \tfrac{1}{2} \bigl( 1 - (1 - \theta(x))^2 \bigr). \]
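
The identities relating $d(x)$ and $c(x)$ to $\theta(x)$ can be checked numerically from the power series. The sketch below is a hedged illustration, not part of the text: the function names, the truncation at 200 terms, and the sample point $x = 0.2$ are assumptions chosen here, and $d_n$ is computed from the formula $d_n = (n-1)! \sum_{k=0}^{n-1} n^k/k!$ quoted later in this section.

```python
import math


def theta_series(x: float, terms: int = 200) -> float:
    """Partial sum of theta(x) = sum_{n>=1} n^(n-1) x^n / n!, for 0 < x < 1/e."""
    return sum(
        math.exp((n - 1) * math.log(n) + n * math.log(x) - math.lgamma(n + 1))
        for n in range(1, terms + 1)
    )


def c_series(x: float, terms: int = 200) -> float:
    """Partial sum of c(x) = sum_{n>=1} n^(n-2) x^n / n!."""
    return sum(
        math.exp((n - 2) * math.log(n) + n * math.log(x) - math.lgamma(n + 1))
        for n in range(1, terms + 1)
    )


def d_series(x: float, terms: int = 200) -> float:
    """Partial sum of d(x) = sum d_n x^n / n!, using d_n / n! = (1/n) sum_{k<n} n^k / k!."""
    total = 0.0
    for n in range(1, terms + 1):
        inner, term = 0.0, 1.0            # term holds n^k / k!
        for k in range(n):
            inner += term
            term *= n / (k + 1)
        total += inner / n * x ** n
    return total


if __name__ == "__main__":
    x = 0.2
    th = theta_series(x)
    print("theta*exp(-theta) - x      :", th * math.exp(-th) - x)
    print("c(x) - (1-(1-theta)^2)/2   :", c_series(x) - 0.5 * (1 - (1 - th) ** 2))
    print("d(x) + log(1-theta)        :", d_series(x) + math.log(1 - th))
```

All three printed differences should be at the level of floating-point rounding, since the series converge geometrically for $x < e^{-1}$.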


Since $b_n = b_n^{(1)} + \cdots + b_n^{(n)}$, we have

\[ B(x) = \sum_{n=1}^{\infty} \frac{b_n x^n}{n!} = \frac{1}{2} \sum_{n=1}^{\infty} \frac{d_n x^n}{n!} + \frac{1}{2} \sum_{n=1}^{\infty} \frac{d_n^{(1)} x^n}{n!} + \frac{1}{2} \sum_{n=1}^{\infty} \frac{d_n^{(2)} x^n}{n!} \]
\[ = \frac{1}{2} d(x) + \theta(x) - \frac{1}{2} c(x) \]
\[ = -\frac{1}{2} \log(1 - \theta(x)) + \theta(x) - \frac{1}{4} \bigl( 1 - (1 - \theta(x))^2 \bigr). \tag{1.7.1} \]

In accordance with the general model of Section 1.4, let us introduce independent
identically distributed random variables $\xi_1, \ldots, \xi_N$ for which

\[ P\{\xi_1 = k\} = \frac{b_k x^k}{k!\, B(x)}, \qquad k = 1, 2, \ldots. \tag{1.7.2} \]

The number of graphs in $\mathcal{U}_n$ with $N$ components can be represented in the form

\[ u_{n,N} = \frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!} = \frac{n!\, (B(x))^N}{N!\, x^n} P\{\xi_1 + \cdots + \xi_N = n\}. \tag{1.7.3} \]

In what follows, we choose

\[ x = (1 - 1/\sqrt{n})\, e^{-1 + 1/\sqrt{n}}. \]


Theorem 1.7.1. As $n \to \infty$,

\[ u_n = \frac{\sqrt{2\pi}\, e^{3/4}}{2^{1/4} \Gamma(1/4)}\, n^{n-1/4} (1 + o(1)), \]

where

\[ \Gamma(p) = \int_0^{\infty} x^{p-1} e^{-x}\, dx \]

is the Euler gamma function.
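
For small $n$ the number $u_n$ can be computed exactly and compared with the asymptotic expression $\sqrt{2\pi}\, e^{3/4}\, n^{n-1/4} / (2^{1/4}\Gamma(1/4))$ of Theorem 1.7.1. The following is a minimal sketch under stated assumptions: it takes $b_m^{(r)} = d_m^{(r)}$ for $r = 1, 2$ and $b_m^{(r)} = d_m^{(r)}/2$ for $r \ge 3$ with $d_m^{(r)} = m!\, m^{m-r-1}/(m-r)!$ (the formula from the proof of Lemma 1.7.5), and it uses the standard recursion over the component containing a fixed vertex. The function names are hypothetical.

```python
import math


def d_cycle(m: int, r: int) -> int:
    """d_m^(r) = m! * m^(m-r-1) / (m-r)!, written so that only integers appear."""
    return math.factorial(m - 1) // math.factorial(m - r) * m ** (m - r)


def b(m: int) -> int:
    """Unicyclic graphs on m labeled vertices (cycles of length 1 and 2 allowed)."""
    total = 0
    for r in range(1, m + 1):
        total += d_cycle(m, r) if r <= 2 else d_cycle(m, r) // 2
    return total


def u_exact(n_max: int) -> list:
    """u_0, ..., u_{n_max}: graphs whose components are all unicyclic."""
    bs = [0] + [b(k) for k in range(1, n_max + 1)]
    u = [1]
    for n in range(1, n_max + 1):
        # The component containing vertex 1 has k vertices: choose the other k-1.
        u.append(sum(math.comb(n - 1, k - 1) * bs[k] * u[n - k] for k in range(1, n + 1)))
    return u


def ratio_to_asymptotic(n: int, u_n: int) -> float:
    """u_n divided by sqrt(2*pi) * e^(3/4) * n^(n - 1/4) / (2^(1/4) * Gamma(1/4))."""
    log_const = 0.5 * math.log(2 * math.pi) + 0.75 - 0.25 * math.log(2) - math.lgamma(0.25)
    return math.exp(math.log(u_n) - log_const - (n - 0.25) * math.log(n))


if __name__ == "__main__":
    u = u_exact(100)
    for n in (10, 50, 100):
        print(n, ratio_to_asymptotic(n, u[n]))
```

The printed ratios should drift toward 1 as $n$ grows; the theorem only guarantees a $1 + o(1)$ factor, so the agreement at moderate $n$ is approximate.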
Before proving Theorem 1.7.1, we will prove some auxiliary results. 
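Lemma 1.7.1. For $x = (1 - 1/\sqrt{n})\, e^{-1 + 1/\sqrt{n}}$,

\[ (1 - \theta(xe^{it}))^2 = \frac{1}{n} - 2it + \varepsilon_1(t) + \varepsilon_2(t, n), \]

where $\varepsilon_1(t)/t \to 0$ as $t \to 0$ uniformly in $n$ and $|\varepsilon_2(t, n)| \le 2|t|/\sqrt{n}$.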

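Proof. We found in Section 1.4 that

\[ u(w) = (1 - \theta(w))^2 = 1 - 2 \sum_{k=1}^{\infty} \frac{k^{k-2} w^k}{k!} = 1 - 2c(w), \qquad |w| \le e^{-1}. \]

When we write $u(w)$ as

\[ u(xe^{it}) = u(e^{-1+it}) + \frac{1}{n} + \varepsilon_2(t, n), \tag{1.7.4} \]

it is clear that $\theta(x) = 1 - 1/\sqrt{n}$ and $\theta(e^{-1}) = 1$; therefore $u(e^{-1}) - u(x) = -1/n$.
With this equality and the observation that $x < e^{-1}$, we obtain the estimates

\[ |\varepsilon_2(t, n)| = |u(xe^{it}) - u(e^{-1+it}) - 1/n| \]
\[ = 2 \left| \sum_{k=1}^{\infty} \frac{k^{k-2} (e^{-k} - x^k)(e^{itk} - 1)}{k!} \right| \]
\[ \le 2 \sum_{k=1}^{\infty} \frac{k^{k-1} (e^{-k} - x^k)\, |t|}{k!} \]
\[ = 2|t| \bigl( \theta(e^{-1}) - \theta(x) \bigr) = 2|t|/\sqrt{n}. \tag{1.7.5} \]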




The function $u(e^{-1+it})$ has the first derivative $-2i$ at the point $t = 0$; thus, as
$t \to 0$,

\[ u(e^{-1+it}) = -2it + o(t). \tag{1.7.6} \]

The assertion of the lemma follows from (1.7.4), (1.7.5), and (1.7.6). □


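Lemma 1.7.2. If $n \to \infty$, $N = \alpha \log n + o(\log n)$, where $\alpha$ is a positive constant,
then

\[ n P\{\xi_1 + \cdots + \xi_N = k\} = \frac{1}{2^{\alpha} \Gamma(\alpha)}\, z^{\alpha-1} e^{-z/2} (1 + o(1)) \]

uniformly in $k$ such that $z = k/n$ lies in any fixed interval of the form
$0 < z_0 \le z \le z_1 < \infty$.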


Proof. The characteristic function of the sum $(\xi_1 + \cdots + \xi_N)/n$ is equal to
$\varphi_N(t) = \varphi^N(t/n)$, where $\varphi(t)$ is the characteristic function of $\xi_1$. It is clear that

\[ \varphi(t) = B(xe^{it})/B(x). \]

Lemma 1.7.1 and equation (1.7.1) give

\[ 4B(xe^{it/n}) = -\log \frac{1 - 2it}{n} + 3 + o(1). \]

Therefore

\[ \varphi\left( \frac{t}{n} \right) = \frac{B(xe^{it/n})}{B(x)} = \frac{\log n - \log(1 - 2it) + 3 + o(1)}{\log n + 3 + o(1)} = 1 - \frac{\log(1 - 2it) + o(1)}{\log n}, \]

and if $N = \alpha \log n + o(\log n)$, then for any fixed $t$,

\[ \varphi_N(t) = \varphi^N(t/n) = \left( 1 - \frac{\log(1 - 2it) + o(1)}{\log n} \right)^{N} = \frac{1 + o(1)}{(1 - 2it)^{\alpha}}, \]

and the distribution of $(\xi_1 + \cdots + \xi_N)/n$ converges weakly to the distribution with
density

\[ \frac{1}{2^{\alpha} \Gamma(\alpha)}\, z^{\alpha-1} e^{-z/2}, \qquad z > 0, \]

that is, to the chi-square distribution with $2\alpha$ degrees of freedom, which corresponds
to the characteristic function $(1 - 2it)^{-\alpha}$.
The local convergence can be proved in the usual way by using Lemmas 1.12.3-1.12.7
from [78]. □

Let $u_{n,N}$ be the number of graphs in $\mathcal{U}_n$ with $N$ components and

\[ \lambda_n = \frac{\log n}{4}, \qquad u = \frac{N - \lambda_n}{\sqrt{\lambda_n}}. \]

Lemma 1.7.3. If $n \to \infty$, then

\[ u_{n,N} = \frac{\sqrt{2\pi}\, e^{3/4}}{2^{1/4} \Gamma(1/4)}\, n^{n-1/4}\, \frac{\lambda_n^N e^{-\lambda_n}}{N!} (1 + o(1)) \]

uniformly in $N$ such that $|u| \le (\log n)^{1/4}$.
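Proof. It is clear that

\[ u_{n,N} = \frac{n!}{N!} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1} \cdots b_{n_N}}{n_1! \cdots n_N!} = \frac{n!\, (B(x))^N}{N!\, x^n} P\{\xi_1 + \cdots + \xi_N = n\}. \tag{1.7.7} \]

By putting $\alpha = 1/4$ in Lemma 1.7.2, we obtain

\[ n P\{\xi_1 + \cdots + \xi_N = n\} = \frac{1}{2^{1/4} \Gamma(1/4)}\, e^{-1/2} (1 + o(1)) \tag{1.7.8} \]

uniformly in $N$ when $|u| \le (\log n)^{1/4}$.
The assertion of the lemma follows from (1.7.7) and (1.7.8), since

\[ B(x) = \tfrac{1}{4} \log n + \tfrac{3}{4} + o(1), \]
\[ x^n = e^{-n - 1/2} (1 + o(1)), \]
\[ (B(x))^N = \lambda_n^N e^{3/4} (1 + o(1)). \tag{1.7.9} \]

□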


The assertion of Theorem 1.7.1 can be obtained by summing $u_{n,N}$ over $N$.
Lemma 1.7.3 estimates $u_{n,N}$ for $N$ close to $\lambda_n$. The following lemmas give estimates
of $u_{n,N}$ for the other values of $N$ needed in the proof.

Lemma 1.7.4. For any fixed $\alpha_0$, $\alpha_1$, $0 < \alpha_0 < \alpha_1 < \infty$, there exists a constant
$c_1$ such that for $\alpha_0 \log n \le N \le \alpha_1 \log n$,

\[ u_{n,N} \le c_1 n^{n-1/4}\, \frac{\lambda_n^N e^{-\lambda_n}}{N!}. \]


Proof. It follows from Lemma 1.7.2 that there exists a constant $A$ such that

\[ n P\{\xi_1 + \cdots + \xi_N = n\} \le A \tag{1.7.10} \]

for $\alpha_0 \log n \le N \le \alpha_1 \log n$. Indeed, if (1.7.10) did not hold, then a sequence of the
parameters $n \to \infty$, $N = \alpha \log n + o(\log n)$ would exist for which the assertion of
Lemma 1.7.2 would not be true. Lemma 1.7.4 then follows from (1.7.7), (1.7.9),
and (1.7.10). □

Lemma 1.7.5. If $N \le \log n$, then there exists a constant $c_2$ such that

\[ n P\{\xi_1 + \cdots + \xi_N = n\} \le c_2 \log n. \]


Proof. It is well known that

\[ d_m = (m - 1)! \sum_{k=0}^{m-1} \frac{m^k}{k!}. \]

Indeed, since the number of forests with $n$ nonroot vertices and $N$ rooted trees
labeled $1, \ldots, N$ is $N(n + N)^{n-1}$, the number $d_m^{(r)}$ of connected graphs of mappings
of an $m$-set into itself with the cycle of size $r$ can be represented as

\[ d_m^{(r)} = \binom{m}{r} (r - 1)!\, r\, m^{m-r-1} = \frac{m!\, m^{m-r-1}}{(m - r)!}. \]

Here $\binom{m}{r}$ is the number of possible choices of $r$ vertices that constitute the cycle;
$(r - 1)!$ is the number of cycles that can be constructed from $r$ vertices; and
$r m^{m-r-1}$ is the number of forests with $r$ cyclic vertices as the roots. Hence,

\[ d_m = \sum_{r=1}^{m} d_m^{(r)} = (m - 1)! \sum_{k=0}^{m-1} \frac{m^k}{k!}. \]

As $m \to \infty$,

\[ d_m = \tfrac{1}{2} (m - 1)!\, e^m (1 + o(1)), \]

and there exists a constant $c_3$ such that

\[ b_m \le d_m \le c_3 (m - 1)!\, e^m. \]
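
The closed form $d_m = (m-1)! \sum_{k=0}^{m-1} m^k/k!$ for the number of connected graphs of mappings can be confirmed by exhaustive enumeration for small $m$. The sketch below is an illustration under stated assumptions: a mapping $f$ on $\{0, \ldots, m-1\}$ is treated as the graph with edges $i$ to $f(i)$, connectivity is tested with a simple union-find, and the function names are hypothetical.

```python
from itertools import product
from math import factorial


def d_formula(m: int) -> int:
    """d_m = (m-1)! * sum_{k=0}^{m-1} m^k / k!; each term (m-1)!/k! * m^k is an integer."""
    return sum(factorial(m - 1) // factorial(k) * m ** k for k in range(m))


def d_bruteforce(m: int) -> int:
    """Count mappings f of {0..m-1} into itself whose graph (edges i -- f(i)) is connected."""
    count = 0
    for f in product(range(m), repeat=m):
        parent = list(range(m))          # union-find forest over the vertices

        def find(v: int) -> int:
            while parent[v] != v:
                parent[v] = parent[parent[v]]   # path halving
                v = parent[v]
            return v

        for i, j in enumerate(f):
            parent[find(i)] = find(j)
        if len({find(v) for v in range(m)}) == 1:
            count += 1
    return count


if __name__ == "__main__":
    for m in range(1, 6):
        print(m, d_formula(m), d_bruteforce(m))
```

The enumeration visits all $m^m$ mappings, so it is practical only for very small $m$; it is meant as a sanity check of the formula, not as a counting method.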


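Moreover, $B(x) = \log n\, (1 + o(1))/4$ and $x^m \le e^{-m}$ for all $m \ge 0$. Therefore
there exists a constant $c_2$ such that

\[ P\{\xi_1 = m\} \le \frac{c_2}{m \log n}. \tag{1.7.11} \]

It is clear that

\[ \{\xi_1 + \cdots + \xi_N = n\} = \bigcup_{i=1}^{N}\ \bigcup_{k \ge [n/N]} \{\xi_i = k,\ \xi_1 + \cdots + \xi_N - \xi_i = n - k\}. \]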



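Since $P\{\xi_1 = k\}$ decreases as $k$ increases, we have

\[ P\{\xi_1 + \cdots + \xi_N = n\} \le N \sum_{k \ge [n/N]} P\{\xi_1 = k\}\, P\{\xi_2 + \cdots + \xi_N = n - k\} \]
\[ \le N P\{\xi_1 = [n/N]\} \sum_{k \ge [n/N]} P\{\xi_2 + \cdots + \xi_N = n - k\} \]
\[ \le N P\{\xi_1 = [n/N]\}. \]

The lemma now follows from (1.7.11). □

Lemma 1.7.6. For $N \le \log n$,

\[ u_{n,N} \le c_4 n^{n-1/4} \log n\, \frac{\lambda_n^N e^{-\lambda_n}}{N!}, \]

where $c_4$ is a constant.
This lemma follows from (1.7.7), (1.7.9), and Lemma 1.7.5.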


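Proof of Theorem 1.7.1. Roughly speaking, $u_{n,N} = c \lambda_n^N e^{-\lambda_n}/N!$, where $c$ does
not depend on $N$, and to obtain $u_n$, we sum the Poisson probabilities whose sum
is 1. To do this rigorously, we divide the sum

\[ u_n = \sum_{N=1}^{\infty} u_{n,N} \]

into four parts. Recall that $u = (N - \lambda_n)/\sqrt{\lambda_n}$. Let

\[ S_1 = \sum_{A_1} u_{n,N}, \qquad S_2 = \sum_{A_2} u_{n,N}, \qquad S_3 = \sum_{A_3} u_{n,N}, \qquad S_4 = \sum_{A_4} u_{n,N}, \]

where

\[ A_1 = \{N : |u| \le (\log n)^{1/4}\}, \]
\[ A_2 = \{N : |u| > (\log n)^{1/4},\ \alpha_0 \log n \le N \le \alpha_1 \log n\}, \]
\[ A_3 = \{N : N < \alpha_0 \log n\}, \]
\[ A_4 = \{N : N > \alpha_1 \log n\}. \]

As $n \to \infty$,

\[ \sum_{A_1} \frac{\lambda_n^N e^{-\lambda_n}}{N!} = 1 + o(1); \tag{1.7.12} \]

therefore it follows from Lemma 1.7.3 that

\[ S_1 = \frac{\sqrt{2\pi}\, e^{3/4}}{2^{1/4} \Gamma(1/4)}\, n^{n-1/4} (1 + o(1)). \]

It remains to show that $S_2$, $S_3$, and $S_4$ are $o(n^{n-1/4})$. Lemma 1.7.4 implies that

\[ S_2 \le c_1 n^{n-1/4} \Bigl( 1 - \sum_{A_1} \frac{\lambda_n^N e^{-\lambda_n}}{N!} \Bigr), \]

and it follows from (1.7.12) that $S_2 = o(n^{n-1/4})$.
To obtain an estimate for $S_3$, we use the inequality

\[ \sum_{1 \le N \le m} \frac{\lambda^N e^{-\lambda}}{N!} \le m\, \frac{\lambda^m e^{-\lambda}}{m!}, \]

which is true for $m \le \lambda$. Choose $\alpha_0 < 1/4$ such that

\[ \alpha_0 - \alpha_0 \log \alpha_0 - \alpha_0 \log 4 < 1/8. \]

Then, for $m = \alpha_0 \log n$,

\[ m\, \frac{\lambda_n^m e^{-\lambda_n}}{m!} \le \frac{c_5}{n^{1/8}}, \]

where $c_5$ is a constant. By using the estimate from Lemma 1.7.6, we find that
$S_3 \le c_4 c_5 n^{n-1/4-1/8} \log n$.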

To obtain an estimate for S4, we use the inequality 


kc) aie 


_ aN enn 
con" 1/4,,0n © 


Nix" N° 
where $c_6$ is a constant, which follows from (1.7.7) if $P\{\xi_1 + \cdots + \xi_N = n\}$ is
replaced by 1. For $m > \lambda$,


(1.7.13) 


Choose a; > 1/4 such that a; —a; loga; —a, log 4 < —2. Then form = a logn 
and A, = (logn)/4, we have the estimate 17? /m! < n~*; thus (1.7.13) implies 
that Sy < cen"—9/4. 

The assertion of the theorem follows from the estimates obtained for $S_1$, $S_2$, $S_3$,
and $S_4$. □

We denote the number of components in a random graph of $\mathcal{U}_n$ by $\varkappa_n$. The
following theorem is a direct corollary of Lemma 1.7.3 and Theorem 1.7.1.

Theorem 1.7.2. As $n \to \infty$,

\[ P\{\varkappa_n = N\} = \frac{2}{\sqrt{2\pi \log n}}\, e^{-u^2/2} (1 + o(1)) \]

uniformly in $N$ for which $u = (N - \tfrac{1}{4} \log n)/\sqrt{\tfrac{1}{4} \log n}$ lies in any fixed finite
interval.

Indeed, Lemma 1.7.3 and Theorem 1.7.1 imply that

\[ P\{\varkappa_n = N\} = \frac{u_{n,N}}{u_n} = \frac{\lambda_n^N e^{-\lambda_n}}{N!} (1 + o(1)) \]

uniformly in $|u| \le (\log n)^{1/4}$, where $\lambda_n = \tfrac{1}{4} \log n$.
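We now consider the maximum size $\beta_n$ of the components of a random graph
from $\mathcal{U}_n$.

Theorem 1.7.3. If $n \to \infty$, then for any fixed $y$, $0 < y < 1$,

\[ P\{\beta_n \le yn\} = \sum_{0 \le s < 1/y} \frac{(-1)^s}{4^s s!}\, W_s(1, y) + o(1), \]

where $W_0(x, y) = 1$, and for $s = 1, 2, \ldots$,

\[ W_s(z, y) = \int_{\{x_i \ge y,\ i = 1, \ldots, s,\ x_1 + \cdots + x_s \le z\}} \frac{dx_1 \cdots dx_s}{x_1 \cdots x_s\, (z - x_1 - \cdots - x_s)^{3/4}}. \]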

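Proof. To study $\beta_n$, we use the general approach of Section 1.2. Let $\eta_1, \ldots, \eta_N$
be random variables with distribution

\[ P\{\eta_1 = n_1, \ldots, \eta_N = n_N\} = P\{\xi_1 = n_1, \ldots, \xi_N = n_N \mid \xi_1 + \cdots + \xi_N = n\}. \tag{1.7.14} \]

It follows from (1.7.7) that these variables can be interpreted as the sizes of the
ordered components of a random graph from $\mathcal{U}_n$ (see Section 1.2), in which $\varkappa_n$ is
$N$. Therefore

\[ P\{\beta_n \le yn\} = \sum_{N=1}^{\infty} P\{\varkappa_n = N\}\, P\{\eta_{(N)} \le yn\}, \tag{1.7.15} \]

where $0 < y < 1$ and $\eta_{(N)} = \max_{1 \le i \le N} \eta_i$. By Lemma 1.2.2,

\[ P\{\eta_{(N)} \le yn\} = \bigl( P\{\xi_1 \le yn\} \bigr)^{N}\, \frac{P\{\tilde\xi_1 + \cdots + \tilde\xi_N = n\}}{P\{\xi_1 + \cdots + \xi_N = n\}}, \tag{1.7.16} \]

where $\tilde\xi_1, \ldots, \tilde\xi_N$ are independent identically distributed random variables for
which

\[ P\{\tilde\xi_1 = k\} = P\{\xi_1 = k \mid \xi_1 \le yn\}, \]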


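and the random variables $\xi_1, \ldots, \xi_N$ have distribution (1.7.2). We now estimate

\[ H_{yn}(t) = \sum_{k > yn} \frac{b_k x^k}{k!}\, e^{itk/n} \]

for $x = (1 - 1/\sqrt{n})\, e^{-1 + 1/\sqrt{n}}$. By (1.7.1) for any fixed $y$, $0 < y < 1$, as $n \to \infty$,

\[ H_{yn}(t) = \frac{1}{2} \sum_{k > yn} \frac{d_k x^k}{k!}\, e^{itk/n} (1 + o(1)). \]

Let us prove that

\[ H_{yn}(t) = H(y, t) + o(1), \]

where

\[ H(y, t) = \frac{1}{4} \int_y^{\infty} u^{-1} e^{-(1 - 2it)u/2}\, du. \]

It is easily seen that

\[ (1 - 1/\sqrt{n})^k e^{k/\sqrt{n}} = e^{-k/(2n)} (1 + O(1/\sqrt{n})), \qquad \sum_{m=0}^{k-1} \frac{k^m e^{-k}}{m!} = \frac{1}{2} + o(1) \]

uniformly in $k > yn$. Therefore, as $n \to \infty$,

\[ H_{yn}(t) = \frac{1}{4} \sum_{k > yn} \frac{1}{k}\, e^{itk/n - k/(2n)} (1 + o(1)) = \frac{1}{4} \sum_{k > yn} \frac{n}{k}\, e^{-(1 - 2it)k/(2n)}\, \frac{1}{n}\, (1 + o(1)). \]

This sum is an integral sum of the function $u^{-1} e^{-(1 - 2it)u/2}$ with step $1/n$. Hence,

\[ H_{yn}(t) = \frac{1}{4} \int_y^{\infty} u^{-1} e^{-(1 - 2it)u/2}\, du + o(1) = H(y, t) + o(1). \]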

In particular, we obtain the following estimate for the tail of the distribution (1.7.2):

\[ P\{\xi_1 > yn\} = \frac{1}{B(x)} \sum_{k > yn} \frac{b_k x^k}{k!} = \frac{4 H_{yn}(0) + o(1)}{\log n} = \frac{4 H(y, 0) + o(1)}{\log n} \tag{1.7.17} \]

as $n \to \infty$.
We now find the limit distribution of the sum $(\tilde\xi_1 + \cdots + \tilde\xi_N)/n$. The characteristic
function of $\tilde\xi_1/n$ is

\[ \psi(t) = \frac{\varphi(t/n) - H_{yn}(t)/B(x)}{1 - H_{yn}(0)/B(x)}. \]

Using the estimates

\[ \varphi(t/n) = 1 - \log(1 - 2it)/\log n + o(1/\log n), \qquad 4B(x) = \log n + O(1), \]

together with (1.7.16) and (1.7.17), we obtain, as $n \to \infty$,

\[ \psi(t) = \left( 1 - \frac{\log(1 - 2it) + 4H(y, t) + o(1)}{\log n} \right) \left( 1 - \frac{4H(y, 0) + o(1)}{\log n} \right)^{-1}, \]

and for any fixed $t$ and $N = \tfrac{1}{4} \log n + o(\log n)$,

\[ \psi^N(t) \to \varphi_y(t) = (1 - 2it)^{-1/4} e^{-H(y, t) + H(y, 0)}. \]


When we expand $e^{-H(y, t)}$ into its Taylor series, as we did in the proof of Lemma
1.6.4, we find that the characteristic function $\varphi_y(t)$ corresponds to the density

\[ f_y(z) = \frac{e^{H(y, 0) - z/2}}{2^{1/4} \Gamma(1/4)} \sum_{0 \le s < 1/y} \frac{(-1)^s}{4^s s!}\, W_s(z, y). \]

Thus, for any $y$, $0 < y < 1$, the distribution of $(\tilde\xi_1 + \cdots + \tilde\xi_N)/n$ converges weakly
to the distribution whose density is $f_y(z)$ as $n \to \infty$ and $N = \tfrac{1}{4} \log n + o(\log n)$.
We can show that local convergence of these distributions holds. If $n \to \infty$ and
$N = \tfrac{1}{4} \log n + o(\log n)$ and $0 < y < 1$, then

\[ n P\{\tilde\xi_1 + \cdots + \tilde\xi_N = k\} = f_y(z) + o(1) \tag{1.7.18} \]

holds uniformly in $k$ for which $z = k/n$ lies in any given interval of the form
$0 < z_0 \le z \le z_1 < \infty$.
Using (1.7.17), we find that for $n \to \infty$ and $N = \tfrac{1}{4} \log n + o(\log n)$,

\[ \bigl( P\{\xi_1 \le yn\} \bigr)^{N} = \left( 1 - \frac{4H(y, 0) + o(1)}{\log n} \right)^{N} = e^{-H(y, 0)} + o(1). \tag{1.7.19} \]

Substituting estimates (1.7.19), (1.7.18), and (1.7.8) into (1.7.16) gives

\[ P\{\eta_{(N)} \le yn\} = \sum_{0 \le s < 1/y} \frac{(-1)^s}{4^s s!}\, W_s(1, y) + o(1). \tag{1.7.20} \]

To obtain the distribution of $\beta_n$, we need to average the distribution of $\eta_{(N)}$ with
respect to the distribution of $\varkappa_n$. By Theorem 1.7.2, the number of components $\varkappa_n$
is asymptotically normal with parameters $(\tfrac{1}{4} \log n, \tfrac{1}{4} \log n)$, and for
$N = \tfrac{1}{4} \log n + o(\log n)$, the probability $P\{\eta_{(N)} \le yn\}$ is asymptotically constant; therefore the
assertion of the theorem follows from (1.7.15). □


Denote by $\mathcal{U}_{n,2}$ and $\mathcal{U}_{n,3}$ the sets of all graphs with $n$ labeled vertices consisting
of unicyclic components where each cycle has more than one or more than two
vertices, respectively. It is not difficult to see that we can treat $\mathcal{U}_{n,i}$, $i = 2, 3$, in
the same way as $\mathcal{U}_n$ (which, following the above notation, we have to denote by
$\mathcal{U}_{n,1}$). The role of $B(x)$ for $\mathcal{U}_{n,i}$, $i = 2, 3$, is played by the generating functions

\[ B_i(x) = \sum_{n=1}^{\infty} \frac{b_{n,i} x^n}{n!}, \qquad i = 2, 3, \]

where $b_{n,i}$ is the number of unicyclic graphs with $n$ vertices and cycle lengths not
less than $i$.

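It is clear that

\[ b_{n,2} = \sum_{r=2}^{\infty} b_n^{(r)} = d_n^{(2)} + \frac{1}{2} \sum_{r=3}^{\infty} d_n^{(r)}, \]

\[ B_2(x) = -\frac{1}{2} c(x) + \frac{1}{2} d(x) = -\frac{1}{4} \bigl( 1 - (1 - \theta(x))^2 \bigr) - \frac{1}{2} \log(1 - \theta(x)), \]

and for $x = (1 - 1/\sqrt{n})\, e^{-1 + 1/\sqrt{n}}$,

\[ B_2(x) = \tfrac{1}{4} \log n - \tfrac{1}{4} + o(1), \]

\[ (B_2(x))^N = \lambda_n^N e^{-1/4} (1 + o(1)). \]

Similarly,

\[ b_{n,3} = \sum_{r=3}^{\infty} b_n^{(r)} = \frac{1}{2} \sum_{r=3}^{\infty} d_n^{(r)}, \]

\[ B_3(x) = \frac{1}{2} d(x) - \theta(x) + \frac{1}{2} c(x) = -\frac{1}{2} \log(1 - \theta(x)) - \theta(x) + \frac{1}{4} \bigl( 1 - (1 - \theta(x))^2 \bigr), \tag{1.7.21} \]

and for $x = (1 - 1/\sqrt{n})\, e^{-1 + 1/\sqrt{n}}$,

\[ B_3(x) = \tfrac{1}{4} \log n - \tfrac{3}{4} + o(1), \]

\[ (B_3(x))^N = \lambda_n^N e^{-3/4} (1 + o(1)). \]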


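Therefore, if $n \to \infty$, then for the numbers $u_n^{(i)}$ of the graphs in $\mathcal{U}_{n,i}$ and for the
number $u_{n,N}^{(i)}$ of such graphs with $N$ components, we have

\[ u_n^{(i)} = A_i n^{n-1/4} (1 + o(1)), \]

\[ u_{n,N}^{(i)} = A_i n^{n-1/4}\, \frac{\lambda_n^N e^{-\lambda_n}}{N!} (1 + o(1)) \]

uniformly in the integers $N$ such that $|N - \lambda_n|/\sqrt{\lambda_n}$ lies in any fixed finite interval,
where

\[ A_1 = \frac{\sqrt{2\pi}\, e^{3/4}}{2^{1/4} \Gamma(1/4)}, \qquad A_2 = \frac{\sqrt{2\pi}\, e^{-1/4}}{2^{1/4} \Gamma(1/4)}, \qquad A_3 = \frac{\sqrt{2\pi}\, e^{-3/4}}{2^{1/4} \Gamma(1/4)}. \tag{1.7.22} \]

Theorems 1.7.2 and 1.7.3 are valid for the random variables $\varkappa_n$ and $\beta_n$ in $\mathcal{U}_{n,2}$
and $\mathcal{U}_{n,3}$.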


1.8. Graphs with components of two types 


The generalized scheme of allocation can be used in the investigations of random
graphs with nonhomogeneous structure. Consider the set $\mathcal{A}_{n,T}$ of all graphs with
$n$ vertices and $T$ edges where each connected component contains no more than
one cycle. As usual, we assign equal probabilities to the elements of $\mathcal{A}_{n,T}$ and
consider a random graph with values from $\mathcal{A}_{n,T}$. Since any graph from the set
$\mathcal{A}_{n,T}$ consists of trees and unicyclic components, we can use the results of the
previous sections to study various characteristics of a random graph from $\mathcal{A}_{n,T}$.
Consider first the number of elements in $\mathcal{A}_{n,T}$. As in the previous sections, we
will denote by $a_n$ the number of graphs under consideration with $n$ vertices and
by $b_n$ the number of connected graphs under consideration with $n$ vertices.

Instead of $\mathcal{A}_{n,T}$, we will use, where necessary, the notation $\mathcal{A}_{n,T}^{(1)}$ if cycles
of lengths 1 and 2 are allowed; $\mathcal{A}_{n,T}^{(2)}$ if cycles of length 1 are forbidden; and
$\mathcal{A}_{n,T}^{(3)}$ if cycles of lengths 1 and 2 are forbidden. Denote the number of graphs
in $\mathcal{A}_{n,T}^{(i)}$ by $a_{n,T}^{(i)}$ and preserve the notation $a_{n,T}$ if the specialization is not needed.
In accordance with the previous sections, the number of forests with $n$ vertices, $T$
edges, and $N = n - T$ trees is denoted by $F_{n,N}$. We use $u_n^{(i)}$ to denote the number
of graphs with $n$ vertices and unicyclic components if they are included in $\mathcal{A}_{n,T}^{(i)}$,
$i = 1, 2, 3$, and preserve the notation $u_n$ for the number of such graphs in $\mathcal{A}_{n,T}$ if
the specialization is not important.

It is clear that

\[ a_{n,T} = \sum_{m=0}^{n} \binom{n}{m} u_m F_{n-m,N}. \tag{1.8.1} \]

Theorem 1.8.1. If $n, T \to \infty$ such that $T/n \to 0$, then

\[ a_{n,T} = F_{n,N} (1 + o(1)) = \frac{n^{2T}}{2^T T!} (1 + o(1)). \]

Proof. It follows from Theorem 1.7.1 that there exists a constant $c_1$ such that

\[ u_m \le c_1 m^{m-1/4}. \tag{1.8.2} \]

Theorem 1.4.3 shows that under the conditions of Theorem 1.8.1,

\[ F_{n,N} = \frac{n^{2T}}{2^T T!} (1 + o(1)). \tag{1.8.3} \]

The condition $T/n \to 0$ implies that $(T - m)/(n - m) \to 0$ uniformly in $m$,
$0 \le m \le T$. Therefore, under the conditions of Theorem 1.8.1, there exists a
constant $c_2$ such that

\[ F_{n-m,N} \le \frac{c_2 (n - m)^{2(T-m)}}{2^{T-m} (T - m)!} \tag{1.8.4} \]

for all $m$, $0 \le m \le T$.
We obtain from (1.8.1), (1.8.2), (1.8.3), and (1.8.4) that

\[ a_{n,T} = F_{n,N} + \sum_{m=1}^{T} \binom{n}{m} u_m F_{n-m,N} = \frac{n^{2T}}{2^T T!} \Bigl( 1 + o(1) + O\Bigl( \sum_{m=1}^{T} \Bigl( \frac{2eTn}{(n - T)^2} \Bigr)^{m} \Bigr) \Bigr). \tag{1.8.5} \]

This completes the proof because $2Tn/(n - T)^2 \to 0$. □


Let $\omega_{n,T}$ be the number of vertices contained in the unicyclic components of
the random graph in $\mathcal{A}_{n,T}$. It is easily seen from Theorem 1.8.1 that if $n, T \to \infty$
and $T/n \to 0$, then

\[ P\{\omega_{n,T} = 0\} \to 1, \]

and the limit distributions of the number of trees of fixed sizes in a random graph
from $\mathcal{A}_{n,T}$ coincide with the corresponding limit distributions in a random forest
and are described in Theorems 1.5.1 and 1.5.2; the limit distribution of the
maximum size of trees in a random graph from $\mathcal{A}_{n,T}$ is given in Theorem 1.6.1.

Now let $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$. According to
Theorem 1.4.3, under these conditions,

\[ F_{n,N} = \frac{n^{2T} \sqrt{1 - \lambda}}{2^T T!} (1 + o(1)). \tag{1.8.6} \]

If $n, T \to \infty$, $2T/n \to \lambda$, $0 < \lambda < 1$, and $m = o(n)$, then by Theorem 1.4.3,

\[ F_{n-m,N} = \frac{(n - m)^{2(T-m)} \sqrt{1 - \lambda}}{2^{T-m} (T - m)!} (1 + o(1)). \tag{1.8.7} \]

Since $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, implies $2(T - m)/(n - m) \le \theta$, there exists a
constant $c$ such that

\[ F_{n-m,N} \le \frac{c\, (n - m)^{2(T-m)}}{2^{T-m} (T - m)!}. \tag{1.8.8} \]

In subsequent proofs, we will use a cumbersome technical estimate given in the
following lemma.

Lemma 1.8.1. Let $n, T \to \infty$ and let there be constants $\lambda_0$ and $\lambda_1$ such that
$0 < \lambda_0 \le \theta = 2T/n \le \lambda_1 < 1$. Then

\[ c_{n,T}(m) = e^{2Tm/n} \left( 1 - \frac{1}{n} \right) \cdots \left( 1 - \frac{m-1}{n} \right) \left( 1 - \frac{m}{n} \right)^{2(T-m)} \left( 1 - \frac{1}{T} \right) \cdots \left( 1 - \frac{m-1}{T} \right) \le 1, \tag{1.8.9} \]

where $m_0 \le m \le T$ and $m_0$ is sufficiently large.


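Proof. Write the logarithm of $c_{n,T}(m)$ as

\[ \log c_{n,T}(m) = \frac{2Tm}{n} + \sum_{i=1}^{m-1} \log\left( 1 - \frac{i}{n} \right) + 2(T - m) \log\left( 1 - \frac{m}{n} \right) + \sum_{i=1}^{m-1} \log\left( 1 - \frac{i}{T} \right) \]
\[ = \frac{2Tm}{n} - \sum_{k=1}^{\infty} \frac{1}{k n^k} \sum_{i=1}^{m-1} i^k - 2(T - m) \sum_{k=1}^{\infty} \frac{m^k}{k n^k} - \sum_{k=1}^{\infty} \frac{1}{k T^k} \sum_{i=1}^{m-1} i^k. \]

Using

\[ \sum_{i=1}^{m-1} i^k \ge \frac{(m - 1)^{k+1}}{k + 1}, \]

we obtain the estimate

\[ \log c_{n,T}(m) \le \sum_{k=1}^{\infty} \frac{2 m^{k+1}}{k n^k} - \sum_{k=1}^{\infty} \frac{2T m^{k+1}}{(k+1) n^{k+1}} - \sum_{k=1}^{\infty} \frac{(m - 1)^{k+1}}{k(k+1) T^k} - \sum_{k=1}^{\infty} \frac{(m - 1)^{k+1}}{k(k+1) n^k} \]
\[ = \sum_{k=1}^{\infty} \frac{m^{k+1}}{k(k+1) n^k} \left( 2(k+1) - \frac{2Tk}{n} - \left( 1 - \frac{1}{m} \right)^{k+1} \left( \frac{n}{T} \right)^{k} - \left( 1 - \frac{1}{m} \right)^{k+1} \right). \]

To prove the assertion of the lemma, we note that for sufficiently large $m$,

\[ c_k = 2(k+1) - \theta k - \left( 1 - \frac{1}{m} \right)^{k+1} \left( \frac{2}{\theta} \right)^{k} - \left( 1 - \frac{1}{m} \right)^{k+1} < 0 \]

for all $k$. Indeed, since $0 < \lambda_0 \le \theta \le \lambda_1 < 1$, for sufficiently large $m$,

\[ \left( 1 - \frac{1}{m} \right)^{k+1} \left( \frac{2}{\theta} \right)^{k} \ge 2^k, \qquad k \ge 1, \]

and therefore

\[ c_k < 2(k+1) - 2^k, \]

which implies that $c_k < 0$ for all $k \ge 3$ and sufficiently large $m$. In addition,

\[ c_1 = 4 - \theta - \left( 1 - \frac{1}{m} \right)^{2} \frac{2}{\theta} - \left( 1 - \frac{1}{m} \right)^{2}, \qquad c_2 = 6 - 2\theta - \left( 1 - \frac{1}{m} \right)^{3} \frac{4}{\theta^2} - \left( 1 - \frac{1}{m} \right)^{3}, \]

and $c_1 < 0$, $c_2 < 0$ for sufficiently large $m$, since for $0 < \lambda_0 \le \theta \le \lambda_1 < 1$,

\[ 3 - \theta - \frac{2}{\theta} < 0, \qquad 5 - 2\theta - \frac{4}{\theta^2} < 0. \]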
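
The estimate of Lemma 1.8.1 can be explored numerically by evaluating the logarithm of $c_{n,T}(m) = e^{2Tm/n} \prod_{i=1}^{m-1} (1 - i/n)\, (1 - m/n)^{2(T-m)} \prod_{i=1}^{m-1} (1 - i/T)$ directly. The sketch below is illustrative only: the function name `log_c` and the sample values $n = 2000$, $T = 600$ (so $\theta = 0.6$) are assumptions made here. It also shows why the lemma requires $m \ge m_0$: for $m = 1$ the quantity is slightly larger than 1.

```python
import math


def log_c(n: int, T: int, m: int) -> float:
    """log c_{n,T}(m) for 1 <= m <= T < n, computed term by term."""
    total = 2.0 * T * m / n + 2.0 * (T - m) * math.log1p(-m / n)
    for i in range(1, m):
        total += math.log1p(-i / n) + math.log1p(-i / T)
    return total


if __name__ == "__main__":
    n, T = 2000, 600
    for m in (1, 5, 20, 100, 600):
        print(m, log_c(n, T, m))
```

Using `log1p` avoids loss of precision when $i/n$ is small; the loop over $i$ makes each evaluation cost $O(m)$, which is adequate for a spot check.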


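Let $b_{n,i}$ be the number of connected unicyclic graphs with $n$ vertices that belong
to $\mathcal{A}_{n,T}^{(i)}$, $i = 1, 2, 3$. If this specification is of no significance, we write $b_n$ for the
number of connected unicyclic graphs. Let $a_{n,T}(k)$ be the number of graphs in
$\mathcal{A}_{n,T}$ with exactly $k$ cycles. It is clear that

\[ a_{n,T}(k) = \frac{1}{k!} \sum_{m=k}^{T} \binom{n}{m} F_{n-m,N} \sum_{m_1 + \cdots + m_k = m} \frac{m!\, b_{m_1} \cdots b_{m_k}}{m_1! \cdots m_k!}. \tag{1.8.10} \]

As in Section 1.7, let

\[ B(x) = \sum_{n=1}^{\infty} \frac{b_n x^n}{n!}, \qquad B_i(x) = \sum_{n=1}^{\infty} \frac{b_{n,i} x^n}{n!}, \quad i = 1, 2, 3, \tag{1.8.11} \]

and set $x = \theta e^{-\theta}$, where $\theta = 2T/n$.
For such $x$, according to (1.7.1) and (1.7.21),

\[ B_1(x) = -\tfrac{1}{2} \log(1 - \theta) + \theta - \tfrac{1}{4} \bigl( 1 - (1 - \theta)^2 \bigr) = -\tfrac{1}{2} \log(1 - \theta) + \tfrac{1}{2} \theta + \tfrac{1}{4} \theta^2, \]

\[ B_2(x) = -\tfrac{1}{2} \log(1 - \theta) - \tfrac{1}{4} \bigl( 1 - (1 - \theta)^2 \bigr) = -\tfrac{1}{2} \log(1 - \theta) - \tfrac{1}{2} \theta + \tfrac{1}{4} \theta^2, \]

\[ B_3(x) = -\tfrac{1}{2} \log(1 - \theta) - \theta + \tfrac{1}{4} \bigl( 1 - (1 - \theta)^2 \bigr) = -\tfrac{1}{2} \log(1 - \theta) - \tfrac{1}{2} \theta - \tfrac{1}{4} \theta^2. \]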


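Theorem 1.8.2. If $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, then for
any $i = 1, 2, 3$ and any fixed $k = 0, 1, \ldots$,

\[ a_{n,T}^{(i)}(k) = \frac{n^{2T} \sqrt{1 - \lambda}\, \Lambda_i^k}{2^T T!\, k!} (1 + o(1)), \]

where $a_{n,T}^{(i)}(k)$ is the number of graphs in $\mathcal{A}_{n,T}^{(i)}$ with exactly $k$ cycles, and

\[ \Lambda_1 = -\tfrac{1}{2} \log(1 - \lambda) + \tfrac{1}{2} \lambda + \tfrac{1}{4} \lambda^2, \]
\[ \Lambda_2 = -\tfrac{1}{2} \log(1 - \lambda) - \tfrac{1}{2} \lambda + \tfrac{1}{4} \lambda^2, \]
\[ \Lambda_3 = -\tfrac{1}{2} \log(1 - \lambda) - \tfrac{1}{2} \lambda - \tfrac{1}{4} \lambda^2. \]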

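Proof. We partition the first sum of (1.8.10) into two parts, $S_1$ and $S_2$. We set
$M = T^{1/4}$ and include in $S_2$ the summands with $m \le M$. For any $x$ from the
convergence domain of the series (1.8.11), the estimate

\[ \sum_{m > M}\ \sum_{m_1 + \cdots + m_k = m} \frac{b_{m_1} x^{m_1} \cdots b_{m_k} x^{m_k}}{m_1! \cdots m_k!} \le k (B(x))^{k-1} \sum_{m > M/k} \frac{b_m x^m}{m!} \tag{1.8.12} \]

holds. As in Section 1.7, let $d_n$ be the number of connected graphs of single-valued
mappings of a set with $n$ elements into itself and let

\[ d(x) = \sum_{m=1}^{\infty} \frac{d_m x^m}{m!}. \]

Since

\[ b_m \le d_m = (m - 1)! \sum_{k=0}^{m-1} \frac{m^k}{k!} \le (m - 1)!\, e^m \]

(see the proof of Lemma 1.7.5), the estimate

\[ \sum_{m > M/k} \frac{b_m x^m}{m!} \le \sum_{m > M/k} (ex)^m \]

holds. Recall that we chose $x = \theta e^{-\theta}$. According to the hypothesis of the theorem,
$\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, and there exists $q < 1$ such that $ex = \theta e^{1-\theta} \le q < 1$
beginning with some $n$. Therefore

\[ \sum_{m > M/k} \frac{b_m x^m}{m!} \le \frac{q^{M/k}}{1 - q}. \tag{1.8.13} \]

Taking into account estimates (1.8.8), (1.8.9), and Lemma 1.8.1, we find that

\[ S_1 = \frac{1}{k!} \sum_{m > M} \binom{n}{m} F_{n-m,N} \sum_{m_1 + \cdots + m_k = m} \frac{m!\, b_{m_1} \cdots b_{m_k}}{m_1! \cdots m_k!} \]
\[ \le \frac{c_1}{k!} \sum_{m > M}\ \sum_{m_1 + \cdots + m_k = m} \frac{n!\, (n - m)^{2(T-m)}\, b_{m_1} \cdots b_{m_k}}{(n - m)!\, 2^{T-m} (T - m)!\, m_1! \cdots m_k!} \]
\[ = \frac{c_1 n^{2T}}{k!\, 2^T T!} \sum_{m > M}\ \sum_{m_1 + \cdots + m_k = m} \theta^m \left( 1 - \frac{1}{n} \right) \cdots \left( 1 - \frac{m-1}{n} \right) \left( 1 - \frac{m}{n} \right)^{2T-2m} \left( 1 - \frac{1}{T} \right) \cdots \left( 1 - \frac{m-1}{T} \right) \frac{b_{m_1} \cdots b_{m_k}}{m_1! \cdots m_k!} \]
\[ \le \frac{c_1 n^{2T}}{k!\, 2^T T!} \sum_{m > M}\ \sum_{m_1 + \cdots + m_k = m} (\theta e^{-\theta})^m\, \frac{b_{m_1} \cdots b_{m_k}}{m_1! \cdots m_k!} \]
\[ \le \frac{c_1 n^{2T}}{k!\, 2^T T!}\, k (B(x))^{k-1} \sum_{m > M/k} \frac{b_m x^m}{m!} \]
\[ \le \frac{c_2 n^{2T}}{k!\, 2^T T!\, (1 - q)}\, q^{T^{1/4}/k}, \]

where $c_1$, $c_2$ are some constants. Thus, under the conditions of the theorem,
$S_1 = o(n^{2T}/(2^T T!))$.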

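We now estimate the sum $S_2$. According to (1.8.7),

\[ T!\, F_{n-m,N} = \frac{T!\, (n - m)^{2(T-m)} \sqrt{1 - \lambda}}{2^{T-m} (T - m)!} (1 + o(1)) = \frac{n^{2T} x^m \sqrt{1 - \lambda}}{2^T n^m} (1 + o(1)) \]

uniformly in $m \le M = T^{1/4}$.
Therefore, for any fixed $k = 1, 2, \ldots$,

\[ S_2 = \frac{1}{k!} \sum_{m \le M} \binom{n}{m} F_{n-m,N} \sum_{m_1 + \cdots + m_k = m} \frac{m!\, b_{m_1} \cdots b_{m_k}}{m_1! \cdots m_k!} \]
\[ = \frac{n^{2T} \sqrt{1 - \lambda}}{k!\, 2^T T!} \sum_{m=k}^{M}\ \sum_{m_1 + \cdots + m_k = m} \frac{b_{m_1} x^{m_1} \cdots b_{m_k} x^{m_k}}{m_1! \cdots m_k!} (1 + o(1)). \]

Taking into account the estimate of $S_1$, we obtain

\[ S_2 = \frac{n^{2T} \sqrt{1 - \lambda}}{k!\, 2^T T!} \Bigl( \sum_{m=k}^{\infty}\ \sum_{m_1 + \cdots + m_k = m} \frac{b_{m_1} x^{m_1} \cdots b_{m_k} x^{m_k}}{m_1! \cdots m_k!} (1 + o(1)) + o(1) \Bigr) \]
\[ = \frac{n^{2T} \sqrt{1 - \lambda}}{k!\, 2^T T!} (B(x))^k (1 + o(1)). \]

Combining the estimates of $S_1$ and $S_2$ yields

\[ a_{n,T}(k) = \frac{n^{2T} \sqrt{1 - \lambda}}{k!\, 2^T T!} (B(x))^k (1 + o(1)) \]

under the hypothesis of the theorem. Since $x = \theta e^{-\theta} \to \lambda e^{-\lambda}$, we also have

\[ B_1(x) \to \Lambda_1, \qquad B_2(x) \to \Lambda_2, \qquad B_3(x) \to \Lambda_3. \]

□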
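Theorem 1.8.3. If $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, then

\[ a_{n,T}^{(i)} = \frac{n^{2T} \sqrt{1 - \lambda}}{2^T T!}\, e^{\Lambda_i} (1 + o(1)), \qquad i = 1, 2, 3, \]

where $\Lambda_i$, $i = 1, 2, 3$, are as in Theorem 1.8.2.

Proof. To obtain the asymptotics of $a_{n,T}$, we have to estimate the sum

\[ a_{n,T} = \sum_{k=0}^{\infty} a_{n,T}(k). \tag{1.8.14} \]

After normalization, we have

\[ \left( \frac{n^{2T}}{2^T T!} \right)^{-1} a_{n,T} = \sum_{k=0}^{\infty} \left( \frac{n^{2T}}{2^T T!} \right)^{-1} a_{n,T}(k), \tag{1.8.15} \]

where for any fixed $k = 0, 1, \ldots$,

\[ \left( \frac{n^{2T}}{2^T T!} \right)^{-1} a_{n,T}(k) \to \frac{(B(\lambda e^{-\lambda}))^k \sqrt{1 - \lambda}}{k!} \]

as $n, T \to \infty$, $2T/n \to \lambda$, $0 < \lambda < 1$. We can pass to the limit under the sum in
(1.8.15) if the series converges uniformly with respect to the parameters $n$, $T$. To
see this, it suffices to obtain an estimate

\[ \left( \frac{n^{2T}}{2^T T!} \right)^{-1} a_{n,T}(k) \le A_k \tag{1.8.16} \]

such that the series $\sum_{k=0}^{\infty} A_k$ converges. Using (1.8.8) and (1.8.9) and reasoning
as we did in the proof of the estimate of $S_1$ give

\[ a_{n,T}(k) = \frac{1}{k!} \sum_{m=k}^{T}\ \sum_{m_1 + \cdots + m_k = m} \binom{n}{m} F_{n-m,N}\, \frac{m!\, b_{m_1} \cdots b_{m_k}}{m_1! \cdots m_k!} \]
\[ \le \frac{c\, n^{2T}}{2^T T!\, k!} \sum_{m=k}^{\infty}\ \sum_{m_1 + \cdots + m_k = m} \frac{x^m\, b_{m_1} \cdots b_{m_k}}{m_1! \cdots m_k!} = \frac{c\, n^{2T} (B(x))^k}{2^T T!\, k!}. \]

Thus we have an estimate of the form (1.8.16) and can pass to the limit under the
sum in (1.8.15) to obtain

\[ a_{n,T} = \frac{n^{2T} \sqrt{1 - \lambda}}{2^T T!} \exp\{B(\lambda e^{-\lambda})\} (1 + o(1)). \]

Depending on the set of graphs under consideration, replace $B(x)$ with $B_1(x)$,
$B_2(x)$, or $B_3(x)$, and Theorem 1.8.3 is proved. □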


A random graph from $\mathcal{A}_{n,T}$ has exactly $N = n - T$ trees and a random number
of unicyclic components. We denote by $\varkappa_{n,T}^{(i)}$ the number of unicyclic components
in a random graph from $\mathcal{A}_{n,T}^{(i)}$, $i = 1, 2, 3$.

Theorem 1.8.4. If $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, then for
any $i = 1, 2, 3$ and for any fixed $k = 0, 1, \ldots$,

\[ P\{\varkappa_{n,T}^{(i)} = k\} = \frac{\Lambda_i^k e^{-\Lambda_i}}{k!} (1 + o(1)), \]

where the $\Lambda_i$ are as in Theorem 1.8.2.
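
Theorem 1.8.4 says that the number of unicyclic components is asymptotically Poisson with mean $\Lambda_i$. A short Python sketch makes the limiting probabilities concrete. It assumes the expressions $\Lambda_i = -\tfrac12\log(1-\lambda) \pm \tfrac12\lambda \pm \tfrac14\lambda^2$ obtained from $B_i(x)$ at $x = \lambda e^{-\lambda}$ (signs $(+,+)$, $(-,+)$, $(-,-)$ for $i = 1, 2, 3$); the function names and the sample value $\lambda = 0.5$ are illustrative assumptions.

```python
import math


def big_lambda(i: int, lam: float) -> float:
    """Lambda_i for 0 < lam < 1, i = 1, 2, 3 (limits of B_i(x) at x = lam * exp(-lam))."""
    base = -0.5 * math.log(1.0 - lam)
    corrections = {
        1: 0.5 * lam + 0.25 * lam ** 2,    # cycles of lengths 1 and 2 allowed
        2: -0.5 * lam + 0.25 * lam ** 2,   # cycles of length 1 forbidden
        3: -0.5 * lam - 0.25 * lam ** 2,   # cycles of lengths 1 and 2 forbidden
    }
    return base + corrections[i]


def limit_pmf(k: int, i: int, lam: float) -> float:
    """Limiting probability of exactly k unicyclic components: Poisson with mean Lambda_i."""
    mean = big_lambda(i, lam)
    return mean ** k * math.exp(-mean) / math.factorial(k)


if __name__ == "__main__":
    lam = 0.5
    for i in (1, 2, 3):
        print(i, big_lambda(i, lam), [round(limit_pmf(k, i, lam), 4) for k in range(4)])
```

Forbidding short cycles lowers the mean: $\Lambda_1 - \Lambda_2 = \lambda$ accounts for loops and $\Lambda_2 - \Lambda_3 = \lambda^2/2$ for cycles of length 2.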


Proof. The assertions of the theorem follow from Theorems 1.8.2 and 1.8.3, since

\[ P\{\varkappa_{n,T}^{(i)} = k\} = a_{n,T}^{(i)}(k) / a_{n,T}^{(i)}. \]

□

Now we consider the case $\theta = 2T/n \to 1$. Let $\omega_{n,T}^{(i)}$ be the number of vertices
that lie in the unicyclic components of a random graph from $\mathcal{A}_{n,T}^{(i)}$, $i = 1, 2, 3$.
It is clear that if we know the distribution of a characteristic of the random graph
under the condition $\{\omega_{n,T}^{(i)} = m\}$, the unconditional distribution can be obtained
by averaging over the distribution of $\omega_{n,T}^{(i)}$.
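Theorem 1.8.5. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then
for any $i = 1, 2, 3$,
$$P\{\omega_{n,T}^{(i)} = m\} = \frac{\varepsilon^2}{2\Gamma(1/4)}\, y^{-3/4} e^{-y}(1 + o(1))$$
uniformly with respect to $m$ such that $y = \varepsilon^2 m/2$ lies in any fixed interval of the
form $0 < y_0 \le y \le y_1 < \infty$, and there exists a constant $A$ such that, for all $m$,
$$P\{\omega_{n,T}^{(i)} = m\} \le A \varepsilon^2 y^{-3/4} e^{-y}.$$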


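Proof. We denote the number of graphs in $A_{n,T}^{(i)}$ by $a_{n,T}^{(i)}$ and the number of graphs
for which $\omega_{n,T}^{(i)} = m$ by $a_{n,T,m}^{(i)}$. Clearly,
$$a_{n,T}^{(i)} = \sum_{m=0}^{T} a_{n,T,m}^{(i)}, \tag{1.8.17}$$
$$a_{n,T,m}^{(i)} = \binom{n}{m} u_m^{(i)} F_{n-m,N}. \tag{1.8.18}$$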


We decompose the sum in (1.8.17) into two parts. Let $0 < y_0 < y_1 < \infty$,
$y = \varepsilon^2 m/2$, and
$$S_1 = \sum_{m:\, y \in [y_0, y_1]} a_{n,T,m}^{(i)}, \qquad S_2 = \sum_{m:\, y \notin [y_0, y_1]} a_{n,T,m}^{(i)}.$$


By Theorem 1.7.1 and the equalities (1.7.21),
$$u_m^{(i)} = A_i m^{m-1/4}(1 + o(1)) \tag{1.8.19}$$
uniformly in $m$ in the region $y_0 \le y \le y_1$, where $A_i$, $i = 1, 2, 3$, are defined in
(1.7.22). There exists a constant $c_1$ such that, for all $m$,
$$u_m^{(i)} \le c_1 m^{m-1/4}. \tag{1.8.20}$$


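To estimate $F_{n-m,N}$, it is convenient to use the intermediate formula (1.4.25). From
(1.4.26) and the equality
$$\theta(2 - \theta) = 1 - \varepsilon^2,$$
we have
$$F_{n-m,N} = \frac{(n-m)!\,(1-\varepsilon^2)^N e^{(n-m)(1-\varepsilon)}}{2^N N!\,(1-\varepsilon)^{n-m}}\, P\{\zeta_N = n - m\}, \tag{1.8.21}$$
where, according to Theorem 1.4.1,
$$P\{\zeta_N = k\} = \frac{1}{\sigma\sqrt{2\pi N}}\, e^{-u^2/2}(1 + o(1))$$
uniformly in $k$ such that $u = (k - N\mu)/(\sigma\sqrt{N})$ lies in any finite interval,
$$\mu = \frac{2}{2-\theta} = \frac{n}{N}, \qquad \sigma^2 = \frac{2(1-\varepsilon)}{\varepsilon(1+\varepsilon)^2}.$$
If $\varepsilon \to 0$, $\varepsilon^3 n \to \infty$, and $m\sqrt{\varepsilon/n} \to 0$, then for $k = n - m$,
$$\frac{k - N\mu}{\sigma\sqrt{N}} = -\frac{m}{\sigma\sqrt{N}} = -m\sqrt{\varepsilon/n}\,(1 + o(1)) \to 0.$$
Consequently,
$$P\{\zeta_N = n - m\} = \frac{1}{\sigma\sqrt{2\pi N}}(1 + o(1)). \tag{1.8.22}$$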


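It follows from (1.8.21) and (1.8.22) that
$$F_{n-m,N} = \frac{(n-m)!\,(1-\varepsilon^2)^N e^{n(1-\varepsilon)}\,(1-\varepsilon)^m e^{-m(1-\varepsilon)}}{\sigma\, 2^N N!\,(1-\varepsilon)^n \sqrt{2\pi N}}(1 + o(1)). \tag{1.8.23}$$
There exists a constant $c$ such that
$$\sigma\sqrt{2\pi N}\, P\{\zeta_N = k\} \le c;$$
therefore
$$F_{n-m,N} \le \frac{c\,(n-m)!\,(1-\varepsilon^2)^N e^{n(1-\varepsilon)}\,(1-\varepsilon)^m e^{-m(1-\varepsilon)}}{\sigma\, 2^N N!\,(1-\varepsilon)^n \sqrt{2\pi N}} \tag{1.8.24}$$
for all $m$, $0 \le m \le T$. We note that as $\varepsilon \to 0$,
$$(1-\varepsilon)^m e^{\varepsilon m} = e^{-y}(1 + o(1))$$
uniformly in $m$ such that $y_0 \le y \le y_1$, and for all $m$,
$$(1-\varepsilon)^m e^{\varepsilon m} \le e^{-y}.$$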


Clearly, (1.8.23) holds uniformly in $m$ such that $y = \varepsilon^2 m/2$ lies in the interval
$[y_0, y_1]$. Therefore, if $n \to \infty$, $\varepsilon = 1 - 2T/n \to 0$, and $\varepsilon^3 n \to \infty$, then
$$F_{n-m,N} = f_n (n-m)!\, e^{-m-y}(1 + o(1)) \tag{1.8.25}$$
uniformly in $m$ such that $y_0 \le y \le y_1$, where
$$f_n = \frac{(1-\varepsilon^2)^N e^{n(1-\varepsilon)} \sqrt{\varepsilon}}{2^N N!\,(1-\varepsilon)^n \sqrt{2\pi n}},$$
and there exists a constant $A_0$ such that for all $m$,
$$F_{n-m,N} \le A_0 f_n (n-m)!\, e^{-m-y}. \tag{1.8.26}$$


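Therefore, by (1.8.18), (1.8.19), and (1.8.25), we have the equality
$$a_{n,T,m}^{(i)} = n!\, A_i f_n \frac{m^{m-1/4} e^{-m-y}}{m!}(1 + o(1)) = n!\, A_i f_n \frac{2^{1/4}\Gamma(1/4)}{\sqrt{2\pi\varepsilon}} \cdot \frac{\varepsilon^2}{2\Gamma(1/4)}\, y^{-3/4} e^{-y}(1 + o(1)), \tag{1.8.27}$$
which holds uniformly in $m$ such that $y_0 \le y \le y_1$; and outside of this domain,
by (1.8.18), (1.8.20), and (1.8.26), we have
$$a_{n,T,m}^{(i)} \le \frac{A\, n!\, f_n}{\sqrt{\varepsilon}}\, \varepsilon^2 y^{-3/4} e^{-y},$$
where $A$ is a constant. The sum
$$\sum_{m:\, y \in [y_0, y_1]} \frac{\varepsilon^2}{2\Gamma(1/4)}\, y^{-3/4} e^{-y} \tag{1.8.28}$$
is the integral sum of the function $(\Gamma(1/4))^{-1} z^{-3/4} e^{-z}$ with step $\varepsilon^2/2$. Therefore,
by choosing $y_0$ small enough and $y_1$ and $n$ large enough, this sum can be made
arbitrarily close to 1, and the sum for remaining values of $m$ can be made arbitrarily
small. Thus
$$a_{n,T}^{(i)} = n!\, A_i f_n \frac{2^{1/4}\Gamma(1/4)}{\sqrt{2\pi\varepsilon}}(1 + o(1)).$$
Now it follows from (1.8.27) and (1.8.28) that
$$P\{\omega_{n,T}^{(i)} = m\} = \frac{a_{n,T,m}^{(i)}}{a_{n,T}^{(i)}} = \frac{\varepsilon^2}{2\Gamma(1/4)}\, y^{-3/4} e^{-y}(1 + o(1))$$
uniformly in $m$ such that $y_0 \le y \le y_1$ and that outside this domain,
$$P\{\omega_{n,T}^{(i)} = m\} \le A \varepsilon^2 y^{-3/4} e^{-y}.$$
This completes the proof of the theorem. □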
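
The key step of the proof is that the integral sum of $(\Gamma(1/4))^{-1} z^{-3/4} e^{-z}$ with step $\varepsilon^2/2$ approaches 1. The following Python sketch evaluates that sum numerically; the function name and the chosen values of $\varepsilon$, $y_0$, $y_1$ are illustrative assumptions, not part of the text. Because the integrand is singular at zero, the sum approaches 1 from below and rather slowly as $\varepsilon$ decreases.

```python
import math

def omega_limit_mass(eps, y0, y1):
    """Sum of eps^2/(2*Gamma(1/4)) * y^(-3/4) * exp(-y) over integers m >= 1
    such that y = eps^2 * m / 2 lies in [y0, y1]."""
    step = eps * eps / 2.0
    m_lo = max(1, math.ceil(y0 / step))
    m_hi = math.floor(y1 / step)
    total = 0.0
    for m in range(m_lo, m_hi + 1):
        y = step * m
        total += step * y ** (-0.75) * math.exp(-y)
    return total / math.gamma(0.25)

# Illustrative values only: a coarse and a finer step.
coarse = omega_limit_mass(eps=0.1, y0=0.0, y1=40.0)
fine = omega_limit_mass(eps=0.02, y0=0.0, y1=40.0)
print(coarse, fine)
```

The finer step gives a value closer to 1, in line with the limiting argument used in the proof.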


When we substitute the exact expressions for $A_i$ and $f_n$, we obtain for $i = 1, 2, 3$,
$$a_{n,T}^{(i)} = \frac{C_i\, n!\,(1-\varepsilon^2)^N e^{n(1-\varepsilon)}}{2^N N!\,(1-\varepsilon)^n \sqrt{2\pi n}}(1 + o(1)), \tag{1.8.29}$$
where $C_1 = e^{3/4}$, $C_2 = e^{-1/4}$, and $C_3 = e^{-3/4}$. It is easy to confirm that if
$\varepsilon = 1 - 2T/n \to 0$, then
$$\frac{n!\,(1-\varepsilon^2)^N e^{n(1-\varepsilon)}}{2^N N!\,(1-\varepsilon)^n \sqrt{2\pi n}} = \frac{n^{2T}}{2^T T!}(1 + o(1)).$$
Thus, under the conditions of Theorem 1.8.5, the asymptotic formulas
$$a_{n,T}^{(i)} = \frac{C_i\, n^{2T}}{2^T T!}(1 + o(1)), \quad i = 1, 2, 3,$$
are valid.
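
The equality that is "easy to confirm" can be checked numerically on a logarithmic scale. The sketch below is an illustration under stated assumptions: the function name and the sample values of $n$ and $T$ are hypothetical, and `math.lgamma` is used for the factorials. By Stirling's formula the ratio of the two sides behaves like $\sqrt{T/N}$, so its logarithm is close to $-\varepsilon$ and tends to zero as $\varepsilon \to 0$.

```python
import math

def log_ratio(n, T):
    """log of  n!(1-eps^2)^N e^{n(1-eps)} / (2^N N! (1-eps)^n sqrt(2 pi n))
    divided by  n^{2T} / (2^T T!),  where N = n - T and eps = 1 - 2T/n."""
    N = n - T
    eps = 1.0 - 2.0 * T / n
    lhs = (math.lgamma(n + 1) + N * math.log1p(-eps * eps) + n * (1.0 - eps)
           - N * math.log(2.0) - math.lgamma(N + 1) - n * math.log1p(-eps)
           - 0.5 * math.log(2.0 * math.pi * n))
    rhs = 2 * T * math.log(n) - T * math.log(2.0) - math.lgamma(T + 1)
    return lhs - rhs

# Illustrative values: eps = 0.01 and eps = 0.001 with n = 10^6.
r1 = log_ratio(10**6, 495000)
r2 = log_ratio(10**6, 499500)
print(r1, r2)
```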

Let $\kappa_{n,T}$ denote the number of unicyclic components in a random graph from
$A_{n,T}$ and use $\beta_{n,T}$ to denote the number of vertices in the maximal unicyclic
component.


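Theorem 1.8.6. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then
for any fixed $x$,
$$P\Bigl\{\kappa_{n,T} + \tfrac{1}{2}\log\varepsilon \le x\sqrt{-\tfrac{1}{2}\log\varepsilon}\Bigr\} \to \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\, du.$$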


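Proof. For any fixed $x$,
$$P\Bigl\{\kappa_{n,T} + \tfrac{1}{2}\log\varepsilon \le x\sqrt{-\tfrac{1}{2}\log\varepsilon}\Bigr\} = \sum_{m=0}^{\infty} P\{\omega_{n,T} = m\}\, P\Bigl\{\kappa_m + \tfrac{1}{2}\log\varepsilon \le x\sqrt{-\tfrac{1}{2}\log\varepsilon}\Bigr\},$$
where $\kappa_m$ is the number of components in a random graph from $U_m$ discussed in
Section 1.7. By Theorem 1.7.2, the random variable
$$\Bigl(\kappa_m - \tfrac{1}{4}\log m\Bigr) \Big/ \sqrt{\tfrac{1}{4}\log m}$$
is asymptotically normal with parameters (0, 1).
Let $y = \varepsilon^2 m/2$ and $0 < y_0 \le y \le y_1 < \infty$. Then $\log m = \log(2y) - 2\log\varepsilon$.
Further, since $\varepsilon \to 0$,
$$P\Bigl\{\kappa_m + \tfrac{1}{2}\log\varepsilon \le x\sqrt{-\tfrac{1}{2}\log\varepsilon}\Bigr\} \to \Phi(x)$$
uniformly in $m$ such that $y \in [y_0, y_1]$ and does not depend on $m$ asymptotically. In
view of Theorem 1.8.5, by choosing $y_0$ small enough and $y_1$ and $n$ large enough,
the sum
$$\sum_{m:\, y \in [y_0, y_1]} P\{\omega_{n,T} = m\}$$
can be made arbitrarily close to 1. Therefore
$$P\Bigl\{\kappa_{n,T} + \tfrac{1}{2}\log\varepsilon \le x\sqrt{-\tfrac{1}{2}\log\varepsilon}\Bigr\} \to \Phi(x)$$
for any fixed $x$. □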
Consider now the maximum size of the unicyclic components. Recall that in
Section 1.7 we introduced $W_s(z, y)$, setting $W_0(z, y) = 1$, and
$$W_s(z, y) = \int_{X_s(z,y)} \frac{dx_1 \cdots dx_s}{x_1 \cdots x_s\,(z - x_1 - \cdots - x_s)^{3/4}},$$
where
$$X_s(z, y) = \{x_i \ge y,\ i = 1, \ldots, s,\ x_1 + \cdots + x_s \le z\}, \quad s = 1, 2, \ldots.$$


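Theorem 1.8.7. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then
for any fixed $x > 0$,
$$P\{\varepsilon^2 \beta_{n,T} \le x\} \to \sum_{s=0}^{\infty} \frac{(-1)^s}{4^s s!} Z_s(x),$$
where
$$Z_s(x) = \frac{1}{\Gamma(1/4)} \int_0^{\infty} y^{-3/4} e^{-y}\, W_s\Bigl(1, \frac{x}{2y}\Bigr)\, dy, \quad s = 0, 1, \ldots.$$
Proof. For any fixed $x > 0$,
$$P\{\varepsilon^2 \beta_{n,T} \le x\} = \sum_{m=0}^{\infty} P\{\omega_{n,T} = m\}\, P\{\varepsilon^2 \beta_m \le x\},$$
where $\beta_m$ is the maximum size of the components in a random graph from $U_m$
studied in Section 1.7. If $y = \varepsilon^2 m/2$ and $y \in [y_0, y_1]$, then
$$P\{\varepsilon^2 \beta_m \le x\} = P\Bigl\{\beta_m \le \frac{x}{2y}\, m\Bigr\}.$$
By Theorem 1.7.3,
$$P\Bigl\{\beta_m \le \frac{x}{2y}\, m\Bigr\} = \sum_{s=0}^{\infty} \frac{(-1)^s}{4^s s!}\, W_s\Bigl(1, \frac{x}{2y}\Bigr) + o(1).$$
It is clear that this holds uniformly in $m$ such that $y \in [y_0, y_1]$. Choosing a small
enough $y_0$ and a large enough $y_1$ and averaging over the distribution of $\omega_{n,T}$ prove
Theorem 1.8.7. □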
The number of trees in any graph of $A_{n,T}$ is $N = n - T$. Let $\eta_{n,T}$ be the
maximum size of trees in a random graph from $A_{n,T}$.

Theorem 1.8.8. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then
$$P\{\beta\eta_{n,T} - u \le z\} \to e^{-e^{-z}},$$
where $\beta = -\log(\theta e^{1-\theta})$, $\theta = 2T/n$, and $u$ is the root of the equation
$$\Bigl(\frac{2}{\pi}\Bigr)^{1/2} N \beta^{3/2} = u^{5/2} e^{u}. \tag{1.8.30}$$
Proof. It is clear that
$$P\{\beta\eta_{n,T} - u \le z\} = \sum_{m=0}^{\infty} P\{\omega_{n,T} = m\}\, P\{\beta\eta_{n-m,T-m} - u \le z\}. \tag{1.8.31}$$
Let $v = \varepsilon^3 n$. It is easily seen that, under the conditions of Theorem 1.8.8, the root
of equation (1.8.30) can be written as
$$u = \log v - \tfrac{5}{2}\log\log v - \log 4\sqrt{\pi} + o(1). \tag{1.8.32}$$
Let $y = \varepsilon^2 m/2$ lie in a finite interval $0 < y_0 \le y \le y_1 < \infty$. Set
$$\theta_m = \frac{2(T-m)}{n-m}, \qquad \varepsilon_m = 1 - \frac{2(T-m)}{n-m},$$
$$v_m = \varepsilon_m^3 n, \qquad \beta(m) = -\log(\theta_m e^{1-\theta_m}).$$
Since $\varepsilon_m = \varepsilon(1 + o(1))$, it follows from (1.8.32) that the root of the equation
$$\Bigl(\frac{2}{\pi}\Bigr)^{1/2} N (\beta(m))^{3/2} = u_m^{5/2} e^{u_m}$$
can be written as
$$u_m = \log v_m - \tfrac{5}{2}\log\log v_m - \log 4\sqrt{\pi} + o(1) = u + o(1)$$
uniformly in $y$ in any fixed interval $[y_0, y_1]$. Therefore, by applying Theorem 1.6.3,
we obtain
$$P\{\beta\eta_{n-m,T-m} - u \le z\} \to e^{-e^{-z}} \tag{1.8.33}$$
uniformly in $y \in [y_0, y_1]$. In the main part of the sum in (1.8.31), this probability
does not depend on $m$ asymptotically. Therefore, averaging (1.8.33) over the
distribution of $\omega_{n,T}$ proves Theorem 1.8.8. □
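
The centering constant $u$ is defined only implicitly, as the root of an equation of the form $(2/\pi)^{1/2} N \beta^{3/2} = u^{5/2} e^{u}$ (the form used here is the one reconstructed for (1.8.30)). A minimal sketch of how it could be computed by bisection follows; the function name and the sample parameters are illustrative assumptions. The left side is increasing in $u$ after taking logarithms, so bisection is sufficient.

```python
import math

def root_u(N, beta):
    """Solve sqrt(2/pi) * N * beta**1.5 = u**2.5 * exp(u) for u > 0 by bisection
    on the increasing function f(u) = 2.5*log(u) + u - target."""
    target = math.log(math.sqrt(2.0 / math.pi) * N * beta ** 1.5)
    lo, hi = 1e-12, max(1.0, target + 1.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 2.5 * math.log(mid) + mid - target < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), target

# Illustrative parameters: n = 10^9 vertices and eps = 0.01, so eps^3 * n = 1000.
n, eps = 10**9, 0.01
theta = 1.0 - eps
N = n * (1.0 + eps) / 2.0             # N = n - T with T = n(1 - eps)/2
beta = theta - 1.0 - math.log(theta)  # beta = -log(theta * e^{1 - theta})
u, target = root_u(N, beta)
print(u)
```

For moderate values of $\varepsilon^3 n$ the numerically computed root can differ noticeably from the leading terms of the expansion (1.8.32), since the $o(1)$ remainder decays slowly.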


When we compare Theorems 1.8.7 and 1.8.8, we see that the maximum size
of trees in a random graph from $A_{n,T}$ is greater than the maximum size of the
unicyclic components, since $\beta = (\varepsilon^2/2)(1 + o(1))$ and $u \to \infty$. Let $\alpha_{n,T}$ be the
maximum size of components of a random graph from $A_{n,T}$, that is,
$$\alpha_{n,T} = \max(\beta_{n,T}, \eta_{n,T}).$$
Averaging over the distribution of $\omega_{n,T}$ gives the following theorem.


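Theorem 1.8.9. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then
for any fixed $z$,
$$P\{\beta\alpha_{n,T} - u \le z\} \to e^{-e^{-z}},$$
where $\beta = -\log(\theta e^{1-\theta})$, $\theta = 2T/n$, and $u$ is the root of the equation
$$\Bigl(\frac{2}{\pi}\Bigr)^{1/2} N \beta^{3/2} = u^{5/2} e^{u}.$$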


To conclude this section, we consider the case where $n, T \to \infty$ such that $\varepsilon^3 n$
tends to a constant.


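Theorem 1.8.10. If $n, T \to \infty$ such that $\varepsilon n^{1/3} \to 2 \cdot 3^{-2/3} v$, where $\varepsilon =
1 - 2T/n$ and $v$ is a constant, then for any $i = 1, 2, 3$,
$$a_{n,T}^{(i)} = \frac{c_i\, n!\, e^{n}}{2^N N! \sqrt{N}}\, p(v)(1 + o(1)),$$
where
$$c_1 = \frac{\sqrt{3}\, e^{3/4}}{2\sqrt{2}\,\Gamma(1/4)}, \qquad c_2 = \frac{\sqrt{3}\, e^{-1/4}}{2\sqrt{2}\,\Gamma(1/4)}, \qquad c_3 = \frac{\sqrt{3}\, e^{-3/4}}{2\sqrt{2}\,\Gamma(1/4)},$$
$$p(v) = \int_0^{\infty} y^{-3/4} p(-v - y;\, 3/2, -1)\, dy,$$
and $p(u;\, 3/2, -1)$ is the density of the stable law defined by (1.4.18).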


Proof. We again use
$$a_{n,T} = \sum_{m=0}^{T} \binom{n}{m} u_m F_{n-m,N}. \tag{1.8.34}$$
According to Theorem 1.7.1, as $m \to \infty$,
$$u_m = A m^{m-1/4}(1 + o(1)), \tag{1.8.35}$$
where the value of the coefficient $A$ depends on the type of the unicyclic components
in $A_{n,T}$, and
$$A_1 = \frac{\sqrt{2\pi}\, e^{3/4}}{2^{1/4}\Gamma(1/4)}, \qquad A_2 = \frac{\sqrt{2\pi}\, e^{-1/4}}{2^{1/4}\Gamma(1/4)}, \qquad A_3 = \frac{\sqrt{2\pi}\, e^{-3/4}}{2^{1/4}\Gamma(1/4)}.$$
To estimate $F_{n-m,N}$, we use formula (1.4.25) with $\theta = 1$. Then
$$F_{n-m,N} = \frac{(n-m)!}{2^N N!\, e^{-n+m}}\, P\{\zeta_N = n - m\}, \tag{1.8.36}$$
where $\zeta_N = \xi_1 + \cdots + \xi_N$ is a sum of independent random variables with
distribution (1.4.19):
$$P\{\xi_1 = k\} = \frac{2 k^{k-2} e^{-k}}{k!}, \quad k = 1, 2, \ldots.$$
By Theorem 1.4.2,
$$b N^{2/3} P\{\zeta_N = k\} = p(u;\, 3/2, -1)(1 + o(1))$$
uniformly in $k$ such that $u = (k - 2N)/(bN^{2/3})$ lies in any fixed finite interval.
Under the conditions of Theorem 1.8.10,
$$(n - 2N)/(bN^{2/3}) \to -v.$$
Let $y = m/(bN^{2/3})$ and $0 < y_0 \le y \le y_1 < \infty$. Then, under the conditions of
the theorem,
$$u = \frac{n - m - 2N}{bN^{2/3}} \to -v - y.$$
Thus, by (1.8.36),
$$F_{n-m,N} = \frac{(n-m)!}{2^N N!\, e^{-n+m}} \cdot \frac{p(-v - y;\, 3/2, -1)}{bN^{2/3}}(1 + o(1)) \tag{1.8.37}$$
uniformly in $m$ such that $y \in [y_0, y_1]$. Since $b = 2(2/3)^{2/3}$, from (1.8.35) and
(1.8.37), we obtain
$$a_{n,T,m} = \binom{n}{m} u_m F_{n-m,N} = \frac{A\, n!\, m^{m-1/4}\, p(-v - y;\, 3/2, -1)}{m!\, 2^N N!\, e^{-n+m}\, bN^{2/3}}(1 + o(1))$$
$$= \frac{A\, n!\, e^{n}\, p(-v - y;\, 3/2, -1)}{2^N N! \sqrt{2\pi}\, m^{3/4}\, bN^{2/3}}(1 + o(1)) = \frac{A\, n!\, e^{n}}{2^N N! \sqrt{2\pi N}\, b^{3/4}}\, y^{-3/4} p(-v - y;\, 3/2, -1)\, \frac{1}{bN^{2/3}}(1 + o(1))$$
uniformly in $m$ such that $y \in [y_0, y_1]$. To obtain $a_{n,T}$, we need to carry out the
summation in (1.8.34). If we choose a small enough $y_0$ and a large enough $y_1$,
substitute the expression of $a_{n,T,m}$ into (1.8.34), note that the obtained sum is the
integral sum of the function $y^{-3/4} p(-v - y;\, 3/2, -1)$ with step $b^{-1}N^{-2/3}$, and
omit the needed estimation of the tails, we have
$$a_{n,T} = \frac{c\, n!\, e^{n}}{2^N N! \sqrt{N}} \int_0^{\infty} y^{-3/4} p(-v - y;\, 3/2, -1)\, dy\,(1 + o(1)),$$
where
$$c = \frac{A}{b^{3/4}\sqrt{2\pi}} = \frac{A\sqrt{3}}{2^{3/4} \cdot 2\sqrt{\pi}}.$$
Recall our convention that if we consider the set $A_{n,T}^{(i)}$, then $A$ is replaced by $A_i$,
$i = 1, 2, 3$. □


It follows from Theorem 1.8.10 that the number $\omega_{n,T}$ of the vertices that form
the unicyclic components in a random graph of $A_{n,T}$ has the following limit
distribution:

If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon n^{1/3} \to 2 \cdot 3^{-2/3} v$, then
$$bN^{2/3} P\{\omega_{n,T} = m\} = \frac{1}{p(v)}\, y^{-3/4} p(-v - y;\, 3/2, -1)(1 + o(1))$$
uniformly in $m$ such that $y = m/(bN^{2/3})$ lies in any fixed interval of the form
$0 < y_0 \le y \le y_1 < \infty$, where $p(v)$ is defined in Theorem 1.8.10.


1.9. Notes and references 


In this book, we use a probabilistic approach to combinatorial problems. Section 1.1 
provides the results from probability theory that suffice for the probabilistic analy- 
sis presented in the book. All of the results in Section 1.1 can be found in standard 
treatments of probability theory; however, we follow [76], where these results are 
given along with full proofs. 

A detailed discussion of the saddle-point method can be found in [42]. The- 
orem 1.1.7 is a simplified version of the corresponding theorem that gives a full 
asymptotic expansion of G(A). 

The proof of the local limit theorem (Theorem 1.1.11) was suggested by 
B. V. Gnedenko and is contained in the book [49], which remains one of the 
best textbooks on the limit theorems of probability theory (see also [43, 122, 60]). 
The approximation of the binomial distribution by the normal and Poisson laws 
was investigated by Yu. V. Prokhorov [125] (see also [90]). The inequality from 
Theorem 1.1.16 was proposed by Hoeffding [59] for sums of bounded random 
variables (see also [122]). 

Section 1.2 is devoted to a description of the generalized scheme of allocation 
of particles, which is a generalization of the multinomial trials. It was introduced in 
[69] and now has a significant place in probabilistic combinatorics (see also [78]). 
Successful applications of the generalized scheme are mostly limited to the equi- 
probable cases; there are only a few examples where a nonequiprobable scheme 
has a natural combinatorial interpretation. Along with the nonequiprobable multi- 
nomial distribution, Example 1.2.3 is an example of a nonequiprobable scheme. 

Example 1.2.4 concerns random forests with rooted trees and is related to
branching processes. Indeed, the distribution (1.2.11) is that of the total progeny
in the Galton–Watson process $\mu(t, G)$ that begins with one particle and in which
the number of offspring of a particle has a Poisson distribution. Therefore a random forest
with $N$ trees and $n$ nonroot vertices can be represented by the same process that
begins with $N$ particles under the condition that the total progeny is $n + N$. We
describe more precisely the correspondence between random trees and the branching
process $\mu(t, G)$, whose distribution of the number of offspring of one particle
is the Poisson distribution with parameter $\lambda$.

Let $\mu_r(t, G)$ be the number of particles at time $t$ having exactly $r$ direct
descendants, and let $\nu(G)$ be the total progeny over the whole period of evolution of
the process.

Consider the set $\mathcal{T}_n$ of all rooted trees whose nonroot vertices are labeled
$1, 2, \ldots, n$, and whose root is labeled by 0. Assigning the probability $(n + 1)^{-n+1}$
to each tree of $\mathcal{T}_n$ gives the uniform distribution on $\mathcal{T}_n$.

Any vertex of a tree is joined to the root by a unique path, whose number of 
edges is called the height of the corresponding vertex. We assume that all the edges 
of a tree are directed from the root and call the number of edges emanating from 
a vertex the degree of the vertex. 

Let $\mu_r(t, T_n)$, $r, t = 0, 1, \ldots, n$, be the number of vertices of height $t$ having
degree $r$. Consider the matrices $\|\mu_r(t, T_n)\|$ and $\|\mu_r(t, G)\|$, $t, r = 0, 1, \ldots, n$,
and a matrix $M = \|m_r(t)\|$ of the same dimension with nonnegative elements.
Kolchin [73] showed that
$$P\{\|\mu_r(t, T_n)\| = M\} = P\{\|\mu_r(t, G)\| = M \mid \nu(G) = n + 1\}.$$
This relation means that the distribution of any random variable that can be
expressed in terms of the random variables $\mu_r(t, T_n)$, $r, t = 0, 1, \ldots, n$, coincides
with the conditional distribution of the corresponding random characteristic of the
branching process under the condition that $\nu(G) = n + 1$.

This scheme has been used widely to obtain a complete description of the prop- 
erties of random trees and forests [73, 74, 75, 111, 112, 113, 114, 116]. Recently 
Yu. L. Pavlov [118, 119] discovered that the branching process that has a geometric
distribution of the number of offspring corresponds, in the same sense
as discussed above, to a random plane planted tree with unlabeled vertices. This
representation of random plane planted trees is also mentioned in [4, 136, 138]. 
Note that we are aware of only these two branching processes that have the Poisson 
and the geometric distributions of the number of offspring, which lead to sets of 
trees with uniform distribution. Results on more general classes of forests with 
nonuniform distributions can be found in [120, 121]. 

The correspondence between random plane planted trees and a branching process
that has a geometric distribution appears to be deep and can be considered
as a correspondence of realizations, that is, there exists a one-to-one correspondence
between the set of such trees and the realizations of the corresponding
branching process. It seems that this fact was first pointed out in an explicit form
by V. A. Vatutin [138].

The general approach to investigating connectivity and the sizes of components 
of random graphs of various types is presented in Section 1.3. This general ap- 
proach was first outlined by Kolchin [78], but its particular forms had already been 
used to investigate other random graphs, such as random permutations, random 
mappings, and random forests of rooted trees [71, 72, 73, 74, 75]. 

Forests of nonrooted trees are investigated in Sections 1.4-1.6. Section 1.4
concerns the number of such forests. The number of forests of $N$ labeled rooted
trees with $n$ nonroot vertices is $N(N + n)^{n-1}$. In contrast to the forests of rooted
trees, the number $F_{n,N}$ of nonrooted forests cannot be expressed by a simple formula.
A complete analysis of the random forests of nonrooted trees was conducted
by V. E. Britikov, who used the generalized scheme of allocation. The possibility
of using such an approach was pointed out in [78, 77]. When Britikov began
investigating $F_{n,N}$, it was known only that for any fixed $N$ as $n \to \infty$,
$$F_{n,N} = \frac{n^{n-2}}{2^{N-1}(N-1)!}(1 + o(1)). \tag{1.9.1}$$
A complete description of the asymptotic behavior of $F_{n,N}$ can be found in [29].
In particular, formula (1.9.1) is generalized for $N \to \infty$, and it is proved that if $n \to \infty$
and $(1 - 2T/n)^3 n \to -\infty$, then


nt? a 


The cases in which $(1 - 2T/n)^3 n$ tends to a constant and $(1 - 2T/n)^3 n \to \infty$ are
covered by Theorems 1.4.4 and 1.4.3, respectively.

Section 1.5 deals with the numbers $\mu_r$ of trees with $r$ vertices, $r = 3, 4, \ldots$, in
a random forest. A complete description of the limit distributions of these random
variables was obtained by Britikov [30]. Theorems 1.5.1 and 1.5.2 summarize the
results proved in [30], where, in addition, the behavior of $\mu_1$ and $\mu_2$ is analyzed.

The general approach used to investigate the order statistics in the generalized 
scheme was suggested in [70] and is also described in Lemma 1.2.2 in [78]. In Sec- 
tion 1.6, we apply this approach to the maximum size of trees in random unrooted 
forests. The results of this section were obtained by Britikov [28]. Theorems 1.6.1— 
1.6.5 cover all possible regular variations of the parameters n and N, but not the 
case where N is bounded. Clearly, for any fixed k, the size of the kth largest tree of 
the forest can be analyzed in the same way. Łuczak and Pittel [101] realized this
possibility and interpreted the results of their analysis as an evolution of a random
forest (see also [31]).

It is pertinent to note here the results that concern the investigations of the
ordered series of components of wide classes of random graphs [4, 7, 14, 15, 35,
36, 41, 56]. There are two natural ways of labeling the components. One way is to
arrange them in decreasing order; the other is to use a particular random labeling
called the size-biased permutation. For the first type of labeling, let $M_1 \ge M_2 \ge \cdots$
be the sequence of sizes of the components of a graph with $n$ vertices numbered
in decreasing order. Let $C_1$ be the size of the component that contains the vertex
with label 1, let $C_2$ be the size of the component that contains the vertex with the
smallest label among the vertices not included in the first component, and so on.

It is clear that the joint distribution of the random variables $C_1, C_2, \ldots$ normalized
by $n$ places unit mass on the set $\Delta$ of infinite sequences of nonnegative
numbers such that
$$\Delta = \{(x_1, x_2, \ldots):\ x_1 + x_2 + \cdots = 1\},$$
and the joint distribution of $M_1, M_2, \ldots$ normalized by $n$ is concentrated on the
set
$$\nabla = \{(x_1, x_2, \ldots) \in \Delta:\ x_1 \ge x_2 \ge \cdots\}.$$


For some classes of graphs, the limit distributions of the sequences $C_1, C_2, \ldots$
and $M_1, M_2, \ldots$ are known. Let us describe a class of the limit distributions.

Let $Z_1, Z_2, \ldots$ be independent identically distributed random variables with
density
$$\theta(1 - z)^{\theta - 1}, \quad 0 < z < 1, \quad \theta > 0.$$
Let
$$Y_1 = Z_1, \quad Y_2 = Z_2(1 - Z_1), \quad Y_3 = Z_3(1 - Z_1)(1 - Z_2), \ \ldots$$
and let $Y_{(1)}, Y_{(2)}, \ldots$ be the order statistics constructed from $Y_1, Y_2, \ldots$. The
distribution of $Y_1, Y_2, \ldots$ on $\Delta$ is called the GEM distribution with parameter $\theta$, and
the distribution of $Y_{(1)}, Y_{(2)}, \ldots$ on $\nabla$ is called the Poisson–Dirichlet distribution
with parameter $\theta$.
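
The construction of the GEM sequence is a stick-breaking procedure and is easy to simulate. The sketch below is only an illustration: the function name, the truncation length, and the seed are assumptions. It relies on the fact that the density $\theta(1-z)^{\theta-1}$ on $(0, 1)$ is the Beta$(1, \theta)$ density, which Python's standard library samples with `random.betavariate`. Sorting a long truncated sequence in decreasing order gives an approximate sample from the Poisson–Dirichlet distribution.

```python
import random

def gem_sample(theta, size, rng):
    """First `size` terms Y_1, Y_2, ... of a GEM(theta) sequence:
    Y_1 = Z_1, Y_2 = Z_2(1 - Z_1), Y_3 = Z_3(1 - Z_1)(1 - Z_2), ..."""
    remaining = 1.0              # (1 - Z_1)...(1 - Z_{j-1})
    sticks = []
    for _ in range(size):
        z = rng.betavariate(1.0, theta)   # density theta * (1 - z)**(theta - 1)
        sticks.append(z * remaining)
        remaining *= 1.0 - z
    return sticks

rng = random.Random(0)                       # fixed seed, for reproducibility only
y = gem_sample(theta=0.5, size=400, rng=rng)
ordered = sorted(y, reverse=True)            # truncated approximation of Y_(1) >= Y_(2) >= ...
print(sum(y), ordered[:3])
```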

It is known that the distribution of the random variables $M_1, M_2, \ldots$ normalized
by $n$ for the cycle sizes of a random permutation of degree $n$ converges, as $n \to \infty$,
to the Poisson–Dirichlet distribution with parameter $\theta = 1$ and that the random
variable $C_1$ is uniformly distributed on the set $\{1, \ldots, n\}$ (see, for example, [78]).
For random mappings, the distributions of the random variables $C_1, C_2, \ldots$ and
$M_1, M_2, \ldots$ normalized by $n$ converge, respectively, to the GEM distribution and
the Poisson–Dirichlet distribution with parameter $\theta = 1/2$ [3].

As usual, let $\alpha_r$ denote the number of components of size $r$ of a random graph
with $n$ vertices. The joint distribution of the random variables $\alpha_1, \ldots, \alpha_n$ of the
form
$$P\{\alpha_1 = a_1, \ldots, \alpha_n = a_n\} = \binom{\theta + n - 1}{n}^{-1} \frac{\theta^{a_1 + \cdots + a_n}}{\prod_{r=1}^{n} r^{a_r} a_r!},$$
where $a_1, \ldots, a_n$ are nonnegative integers such that $a_1 + 2a_2 + \cdots + na_n = n$, is
similar to the joint distribution of the random variables $\alpha_1, \ldots, \alpha_n$ for a random
permutation (see Lemma 1.3.7). This distribution arises frequently in population
genetics and is known as the Ewens distribution [40, 67].
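
As a sanity check on the Ewens formula in its standard form, the probabilities can be summed over all admissible vectors $(a_1, \ldots, a_n)$. The sketch below is illustrative: the function names are hypothetical, and the generalized binomial coefficient $\binom{\theta+n-1}{n}$ is computed as $\theta(\theta+1)\cdots(\theta+n-1)/n!$. For $\theta = 1$ the formula reduces to the cycle-count distribution of a uniform random permutation.

```python
import math

def cycle_type_vectors(n):
    """Yield all (a_1, ..., a_n) of nonnegative integers with a_1 + 2*a_2 + ... + n*a_n = n."""
    def rec(r, remaining):
        if r > n:
            if remaining == 0:
                yield ()
            return
        for a in range(remaining // r + 1):
            for rest in rec(r + 1, remaining - a * r):
                yield (a,) + rest
    yield from rec(1, n)

def ewens_probability(counts, theta):
    """P{alpha_1 = a_1, ..., alpha_n = a_n} under the Ewens distribution with parameter theta."""
    n = sum(r * a for r, a in enumerate(counts, start=1))
    rising = math.prod(theta + j for j in range(n))   # theta (theta+1) ... (theta+n-1)
    binom = rising / math.factorial(n)                # generalized C(theta + n - 1, n)
    weight = math.prod((theta / r) ** a / math.factorial(a)
                       for r, a in enumerate(counts, start=1))
    return weight / binom

total = sum(ewens_probability(c, 0.5) for c in cycle_type_vectors(6))
identity = ewens_probability((4, 0, 0, 0), 1.0)   # four fixed points of a permutation of degree 4
print(total, identity)
```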

If the random variables $C_1, C_2, \ldots$ and $M_1, M_2, \ldots$ correspond to a graph
with the Ewens distribution of $\alpha_1, \ldots, \alpha_n$ with parameter $\theta$, then as $n \to \infty$,
the distributions of the normalized random variables converge, respectively, to the
GEM distribution and the Poisson–Dirichlet distribution with the same parameter $\theta$
[67]. See also [139, 140, 141].

Section 1.7 contains the results on unicyclic random graphs obtained in [77]. The 
analysis of random graphs with components of two types presented in Section 1.8 
is also contained in [77]. The idea of considering a graph as a combination of 
connected components of certain types can be attributed to Agadzhanyan [1, 2]. 
The results of Section 1.8 can be found in [77]. 


2 


Evolution of random graphs 


2.1. Subcritical graphs 


This chapter deals with several models of random graphs with $n$ labeled vertices
and $T$ edges as $n, T \to \infty$. The parameter $\theta = 2T/n$ plays a decisive role in the
behavior of random graphs, and it may be interpreted as time in the evolution of the
graphs. It turns out that many of the characteristics change their behavior abruptly
near the point $\theta = 1$. It is convenient to distinguish three domains of the variation
of the parameter $\theta$. We say that a random graph is subcritical if $n, T \to \infty$ in such
a way that $(1 - \theta)^3 n \to \infty$. Thus, for a subcritical graph, $\theta$ may tend to unity, but
not too fast. A critical graph is characterized by the conditions that $n, T \to \infty$ and
$(1 - \theta)^3 n$ tends to a constant. And, finally, a graph is supercritical if $n, T \to \infty$
and $(1 - \theta)^3 n \to -\infty$.

In this section we consider three sets of graphs. Let $\mathcal{G}_{n,T}^{(1)}$ be the set of all graphs
with $n$ labeled vertices and $T$ edges with loops and multiple edges, provided each
vertex may have no more than one loop and each pair of vertices may be connected
by no more than two edges. Let $\mathcal{G}_{n,T}^{(2)}$ be the set of all graphs with $n$ labeled vertices
and $T$ edges that have no loops; however, each edge may occur twice, so that each
pair of vertices may be connected by no more than two edges. And, finally, let
$\mathcal{G}_{n,T}^{(3)}$ be the set of all graphs with $n$ labeled vertices and $T$ edges that have neither
loops nor multiple edges.

Denote the number of graphs in $\mathcal{G}_{n,T}^{(i)}$ by $g_{n,T}^{(i)}$, $i = 1, 2, 3$. We introduce the
uniform distribution on $\mathcal{G}_{n,T}^{(i)}$, $i = 1, 2, 3$, assigning equal probabilities to all elements
of the corresponding set, and denote by $G_{n,T}^{(i)}$ a random graph such that
$$P\{G_{n,T}^{(i)} = G\} = (g_{n,T}^{(i)})^{-1}$$
for any $G \in \mathcal{G}_{n,T}^{(i)}$, $i = 1, 2, 3$.
Recall that in Section 1.8 we considered the sets $A_{n,T}^{(i)}$, $i = 1, 2, 3$, of all graphs
with $n$ labeled vertices and $T$ edges with components of two types: trees and
unicyclic components. In $A_{n,T}^{(3)}$, the unicyclic components have neither loops nor
multiple edges; in $A_{n,T}^{(2)}$, the unicyclic components have no loops, but may contain
cycles of length 2; and in $A_{n,T}^{(1)}$, the unicyclic components may contain loops and
cycles of length 2. Thus,
$$A_{n,T}^{(i)} \subseteq \mathcal{G}_{n,T}^{(i)}, \quad i = 1, 2, 3.$$
The results of Section 1.8 allow us to describe the limit distributions of various
characteristics of subcritical random graphs $G_{n,T}^{(i)}$, $i = 1, 2, 3$.

Theorem 2.1.1. If $n, T \to \infty$ such that $(1 - 2T/n)^3 n \to \infty$, then for any
$i = 1, 2, 3$,
$$P\{G_{n,T}^{(i)} \in A_{n,T}^{(i)}\} \to 1.$$
Proof. It is clear that
$$P\{G_{n,T}^{(i)} \in A_{n,T}^{(i)}\} = a_{n,T}^{(i)}/g_{n,T}^{(i)}.$$
We need to determine the asymptotics of $g_{n,T}^{(i)}$, $i = 1, 2, 3$, under the conditions of
Theorem 2.1.1 to match the results on $a_{n,T}^{(i)}$ from Section 1.8.

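Recall that if $\theta = 2T/n \to \lambda$, $0 \le \lambda \le 1$, then by Theorems 1.8.1, 1.8.2, and
assertion (1.8.29),
$$a_{n,T}^{(i)} = \frac{c_i(\lambda)\, n^{2T}}{2^T T!}(1 + o(1)) \tag{2.1.1}$$
for any $i = 1, 2, 3$, where
$$c_1(\lambda) = e^{\lambda/2 + \lambda^2/4}, \qquad c_2(\lambda) = e^{-\lambda/2 + \lambda^2/4}, \qquad c_3(\lambda) = e^{-\lambda/2 - \lambda^2/4}.$$
If $n, T \to \infty$ and $T^3/n^4 \to 0$, then
$$g_{n,T}^{(3)} = \binom{\binom{n}{2}}{T} = \frac{(n(n-1))^T}{2^T T!}\Bigl(1 - \frac{2}{n(n-1)}\Bigr)\Bigl(1 - \frac{4}{n(n-1)}\Bigr) \cdots \Bigl(1 - \frac{2(T-1)}{n(n-1)}\Bigr)$$
$$= \frac{n^{2T}}{2^T T!}\, e^{-T/n - T^2/n^2}(1 + o(1)), \tag{2.1.2}$$
and Theorem 2.1.1 is proved for $i = 3$.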
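
For graphs without loops and multiple edges the count is the binomial coefficient $\binom{\binom{n}{2}}{T}$, so the approximation $n^{2T} e^{-T/n - T^2/n^2}/(2^T T!)$ can be compared with the exact value on a logarithmic scale. The sketch below is an illustration; the function names and the sample values of $n$ and $T$ are assumptions chosen so that $T^3/n^4$ is small.

```python
import math

def log_g3_exact(n, T):
    """log of C(S, T) with S = n(n-1)/2, computed as sum of log(S - j) minus log T!."""
    S = n * (n - 1) // 2
    return sum(math.log(S - j) for j in range(T)) - math.lgamma(T + 1)

def log_g3_asymptotic(n, T):
    """log of n^{2T} / (2^T T!) * exp(-T/n - T^2/n^2)."""
    return (2 * T * math.log(n) - T * math.log(2.0) - math.lgamma(T + 1)
            - T / n - (T / n) ** 2)

n, T = 20000, 8000                      # illustrative values; T^3 / n^4 is about 3e-6
difference = log_g3_exact(n, T) - log_g3_asymptotic(n, T)
print(difference)
```

A difference close to zero means that the ratio of the exact count to the approximation is close to 1, as the $(1 + o(1))$ factor in the formula requires.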
It is clear that each graph from $\mathcal{G}_{n,T}^{(2)}$ can be obtained by a choice of $T$ edges,
which is equivalent to an allocation of $T$ particles into $\binom{n}{2}$ cells, provided each cell
contains no more than two particles. Therefore
$$g_{n,T}^{(2)} = \sum_{t_1 + 2t_2 = T} \binom{S}{t_1}\binom{S - t_1}{t_2},$$
where $S = \binom{n}{2}$, $t_1$ cells have exactly one particle, and $t_2$ cells have two particles.
Hence,
$$g_{n,T}^{(2)} = \sum_{t_1 + 2t_2 = T} \frac{S!}{t_1!\, t_2!\,(S - t_1 - t_2)!} = \frac{1}{T!} \sum_{0 \le t \le T/2} \frac{1}{t!} \cdot \frac{T!\, S!}{(T - 2t)!\,(S - T + t)!}.$$
For any fixed $t$,
$$\frac{T!\, S!}{(T - 2t)!\,(S - T + t)!} = T^{2t} S^{T-t} e^{-T^2/(2S)}(1 + o(1)) = S^T \Bigl(\frac{2T^2}{n^2}\Bigr)^{t} e^{-T^2/n^2}(1 + o(1)).$$
Therefore, under the conditions of Theorem 2.1.1,
$$g_{n,T}^{(2)} = \frac{S^T e^{-T^2/n^2}}{T!} \sum_{t=0}^{\infty} \frac{1}{t!}\Bigl(\frac{2T^2}{n^2}\Bigr)^{t}(1 + o(1)) = \frac{n^{2T} e^{-T/n - T^2/n^2}}{2^T T!}\, e^{2T^2/n^2}(1 + o(1))$$
$$= \frac{n^{2T}}{2^T T!}\, e^{-T/n + T^2/n^2}(1 + o(1)). \tag{2.1.3}$$


Similarly, each graph from $\mathcal{G}_{n,T}^{(1)}$ can be obtained by a choice of $T$ edges, which is
equivalent to an allocation of $T$ particles into $n + \binom{n}{2}$ cells, provided that no more
than two particles are allocated into each of $\binom{n}{2}$ cells and only one particle may be
put into each of $n$ cells. Therefore, putting $S = \binom{n}{2}$ yields
$$g_{n,T}^{(1)} = \sum_{t + t_1 + 2t_2 = T} \binom{n}{t}\binom{S}{t_1}\binom{S - t_1}{t_2}.$$
By the same arguments under the conditions of Theorem 2.1.1,
$$g_{n,T}^{(1)} = \frac{n^{2T}}{2^T T!}\, e^{T/n + T^2/n^2}(1 + o(1)). \tag{2.1.4}$$
Then, by comparing (2.1.1) to (2.1.2), (2.1.3), and (2.1.4), we obtain the assertion
of the theorem. □

According to Theorem 2.1.1, each of the subcritical graphs $G_{n,T}^{(i)}$, $i = 1, 2, 3$,
consists of trees and unicyclic components and, with probability tending to 1, does
not contain more complicated components.
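
The statement that a subcritical graph consists only of trees and unicyclic components can be explored by simulation for graphs without loops and multiple edges. The following sketch is an illustration under stated assumptions: the function names, the union-find bookkeeping, and the sample sizes are not from the text. A connected component with $k$ vertices is a tree if it has $k - 1$ edges and unicyclic if it has $k$ edges, so it suffices to check that no component has more edges than vertices.

```python
import random

def only_trees_and_unicyclic(n, edges):
    """True if every connected component has at most as many edges as vertices."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    vertex_count, edge_count = {}, {}
    for v in range(n):
        r = find(v)
        vertex_count[r] = vertex_count.get(r, 0) + 1
    for u, _ in edges:
        r = find(u)
        edge_count[r] = edge_count.get(r, 0) + 1
    return all(edge_count.get(r, 0) <= k for r, k in vertex_count.items())

def sample_simple_graph(n, T, rng):
    """Uniformly chosen set of T distinct edges on n labeled vertices, without loops."""
    edges = set()
    while len(edges) < T:
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    return sorted(edges)

def estimate_probability(n, T, trials, seed=0):
    """Fraction of sampled graphs that consist of trees and unicyclic components only."""
    rng = random.Random(seed)
    hits = sum(only_trees_and_unicyclic(n, sample_simple_graph(n, T, rng))
               for _ in range(trials))
    return hits / trials

print(estimate_probability(n=2000, T=500, trials=50))   # theta = 2T/n = 0.5
```

Repeating the estimate with $2T/n$ closer to 1 or above it gives a rough empirical picture of how the probability changes near the critical point; the finite-size values are only indicative of the limit statements.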

Given a random graph $G$, denote by $\mu_r(G)$ the number of trees of size $r$, by
$\eta(G)$ the maximum size of trees, by $\omega(G)$ the total number of vertices in the
unicyclic components, by $\kappa(G)$ the number of unicyclic components, by $\beta(G)$ the
maximum size of the unicyclic components, and by $\alpha(G)$ the maximum size of
the components.

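Let $\gamma(G_{n,T}^{(i)})$ be a characteristic of the random graph $G_{n,T}^{(i)}$, and let $\gamma_{n,T}^{(i)}$ be the
corresponding characteristic of the random graph from $A_{n,T}^{(i)}$. Then, by the formula
of total probability,
$$P\{\gamma(G_{n,T}^{(i)}) \le x\} = P\{G_{n,T}^{(i)} \in A_{n,T}^{(i)}\}\, P\{\gamma_{n,T}^{(i)} \le x\} + P\{G_{n,T}^{(i)} \notin A_{n,T}^{(i)}\}\, P\{\gamma(G_{n,T}^{(i)}) \le x \mid G_{n,T}^{(i)} \notin A_{n,T}^{(i)}\}$$
for any $x$. By Theorem 2.1.1,
$$P\{G_{n,T}^{(i)} \in A_{n,T}^{(i)}\} \to 1$$
if the graph $G_{n,T}^{(i)}$ is subcritical. Therefore, for any characteristic $\gamma(G_{n,T}^{(i)})$ of the
subcritical graph,
$$P\{\gamma(G_{n,T}^{(i)}) \le x\} = P\{\gamma_{n,T}^{(i)} \le x\}(1 + o(1)) + o(1), \tag{2.1.5}$$
and if $P\{\gamma_{n,T}^{(i)} \le x\}$ tends to a limit, then the probability $P\{\gamma(G_{n,T}^{(i)}) \le x\}$ has the
same limit. Thus, many of the results of Section 1.8 can be reformulated for the
corresponding characteristics of the random graphs $G_{n,T}^{(i)}$, $i = 1, 2, 3$. If $\gamma(G_{n,T}^{(i)})$
is an integer-valued characteristic, then for any fixed integer $k$,
$$P\{\gamma(G_{n,T}^{(i)}) = k\} = P\{\gamma_{n,T}^{(i)} = k\}(1 + o(1)) + o(1), \tag{2.1.6}$$
and if $P\{\gamma_{n,T}^{(i)} = k\}$ has a nonzero limit, then relation (2.1.6) allows us to obtain
the limit of the probability $P\{\gamma(G_{n,T}^{(i)}) = k\}$.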


Theorem 2.1.2. If $n, T \to \infty$ such that $T/n \to 0$, then for any $i = 1, 2, 3$,
$$P\{\omega(G_{n,T}^{(i)}) = 0\} \to 1.$$
If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then for any fixed
$x > 0$ and any $i = 1, 2, 3$,
$$P\{\omega(G_{n,T}^{(i)})\,\varepsilon^2/2 \le x\} \to \frac{1}{\Gamma(1/4)} \int_0^{x} y^{-3/4} e^{-y}\, dy.$$
Proof. The assertions of the theorem follow from (2.1.5), (2.1.6), and Theorems
1.8.1 and 1.8.5. □




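Theorem 2.1.3. If the graph $G_{n,T}^{(i)}$ is subcritical, $i = 1, 2, 3$, and $r = r(n, T) \ge 3$
varies such that $N p_r(\theta) \to \infty$, then for any fixed $x$,
$$P\Bigl\{\frac{\mu_r(G_{n,T}^{(i)}) - N p_r(\theta)}{\sigma_{rr}(\theta)\sqrt{N}} \le x\Bigr\} \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy,$$
where
$$N = n - T, \qquad \theta = 2T/n,$$
$$p_k(\theta) = \frac{2 k^{k-2} \theta^{k-1} e^{-k\theta}}{k!\,(2 - \theta)}, \quad k = 1, 2, \ldots,$$
$$\sigma_{rr}^2(\theta) = p_r(\theta)\Bigl(1 - p_r(\theta) - \frac{(\mu - r)^2 p_r(\theta)}{\sigma^2}\Bigr),$$
$$\mu = \frac{2}{2 - \theta}, \qquad \sigma^2 = \frac{2\theta}{(1 - \theta)(2 - \theta)^2}.$$
If $r = r(n, T) \ge 3$ varies such that $N p_r(\theta) \to \lambda$, $0 < \lambda < \infty$, then for any fixed
$k = 0, 1, \ldots$,
$$P\{\mu_r(G_{n,T}^{(i)}) = k\} = \frac{\lambda^k e^{-\lambda}}{k!} + o(1).$$
Proof. In view of (2.1.5) and (2.1.6), the assertion of the theorem follows from
Theorems 1.5.1 and 1.5.2 because, by Theorem 2.1.2, the number $\omega(G_{n,T}^{(i)})$ of
vertices in the unicyclic components for subcritical graphs is small compared with
the total number of vertices; more precisely, $P\{\omega(G_{n,T}^{(i)}) \le n^{2/3}\} \to 1$. □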


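Theorem 2.1.4. If $n, T \to \infty$ such that $T/n \to 0$, $r = r(n, T) > 1$ and
$N p_r(\theta) \to \infty$, $N p_{r+1}(\theta) \to \lambda$, $0 < \lambda < \infty$, then for any $i = 1, 2, 3$,
$$P\{\alpha(G_{n,T}^{(i)}) = r\} = P\{\eta(G_{n,T}^{(i)}) = r\} + o(1) = e^{-\lambda} + o(1),$$
$$P\{\alpha(G_{n,T}^{(i)}) = r + 1\} = P\{\eta(G_{n,T}^{(i)}) = r + 1\} + o(1) = 1 - e^{-\lambda} + o(1).$$
Proof. In view of (2.1.5) and (2.1.6), the assertions of the theorem follow from
Theorem 1.6.1. □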


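Theorem 2.1.5. If $i = 1, 2, 3$ and $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$,
$0 < \lambda < 1$, then for any fixed $k = 0, 1, \ldots$,
$$P\{\kappa(G_{n,T}^{(i)}) = k\} = \frac{\Lambda_i^k e^{-\Lambda_i}}{k!}(1 + o(1)),$$
where
$$\Lambda_1 = -\frac{1}{2}\log(1 - \lambda) + \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_2 = -\frac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_3 = -\frac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} - \frac{\lambda^2}{4}.$$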


For any fixedk =0,+1,..., 
P{a(Gy'r) — [a] < k} = P{n(Gt,) — [a] < k}(. + 0(1)) 


5/2 
= 1 = 10g)? +a a—1-10g,) 


at ja +o(1)), 


= exp| - 


where
$$a = \frac{\log n - (5/2)\log\log n}{\theta - 1 - \log\theta},$$
and $[a]$ and $\{a\}$ are, respectively, the integer and fractional parts of $a$.

Proof. The assertions of the theorem follow from (2.1.5), (2.1.6), and Theorems
1.8.4 and 1.6.2. □
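
For a concrete feel of the Poisson limit for the number of unicyclic components, the parameters can be evaluated numerically. The sketch below assumes the reading $\Lambda_i = -\tfrac{1}{2}\log(1-\lambda) \pm \lambda/2 \pm \lambda^2/4$, with signs $(+,+)$, $(-,+)$, $(-,-)$ for $i = 1, 2, 3$, which is the form consistent with the constants $c_i(\lambda)$ in (2.1.1); the function names are hypothetical.

```python
import math

# Sign pattern of the lambda/2 and lambda^2/4 terms for i = 1, 2, 3 (assumed reading).
SIGNS = {1: (+1, +1), 2: (-1, +1), 3: (-1, -1)}

def unicyclic_rate(lam, i):
    """Poisson parameter Lambda_i for 0 < lam < 1."""
    s1, s2 = SIGNS[i]
    return -0.5 * math.log(1.0 - lam) + s1 * lam / 2.0 + s2 * lam ** 2 / 4.0

def poisson_pmf(k, rate):
    """Limiting probability of exactly k unicyclic components."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

lam = 0.5                                   # illustrative value of the limit of 2T/n
rates = [unicyclic_rate(lam, i) for i in (1, 2, 3)]
no_cycle = [poisson_pmf(0, r) for r in rates]
print(rates, no_cycle)
```

All three parameters grow like $-\tfrac{1}{2}\log(1-\lambda)$ as $\lambda \to 1$, which is the source of the logarithmic centering in the near-critical normal limit.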


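Theorem 2.1.6. If $i = 1, 2, 3$ and $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and
$\varepsilon^3 n \to \infty$, then for any fixed $x$,
$$P\Bigl\{\kappa(G_{n,T}^{(i)}) + \tfrac{1}{2}\log\varepsilon \le x\sqrt{-\tfrac{1}{2}\log\varepsilon}\Bigr\} \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy,$$
and for any fixed $x > 0$,
$$P\{\varepsilon^2 \beta(G_{n,T}^{(i)}) \le x\} = \sum_{s=0}^{\infty} \frac{(-1)^s}{4^s s!} Z_s(x) + o(1),$$
where $Z_s(x)$ is defined in Theorem 1.8.7. Finally, for any fixed $z$,
$$P\{\beta\alpha(G_{n,T}^{(i)}) - u \le z\} = P\{\beta\eta(G_{n,T}^{(i)}) - u \le z\}(1 + o(1)) = e^{-e^{-z}}(1 + o(1)),$$
where $\beta = -\log(\theta e^{1-\theta})$, $\theta = 2T/n$, and $u$ is the root of the equation
$$\Bigl(\frac{2}{\pi}\Bigr)^{1/2} N \beta^{3/2} = u^{5/2} e^{u}. \tag{2.1.7}$$
Proof. The results of the theorem are the consequences of (2.1.5), (2.1.6), and
Theorems 1.8.6, 1.8.7, 1.8.8, and 1.8.9. □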
2.2. Critical graphs

Recall that a graph with $n$ vertices and $T$ edges is called critical if $n, T \to \infty$ such
that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n$ tends to a constant. We have seen that many of
the characteristics of the random graphs $G_{n,T}^{(i)}$, $i = 1, 2, 3$, change their behavior
if $\theta = 2T/n$ approaches the value 1. For example, the number of cycles, or the
number of unicyclic components $\kappa(G_{n,T}^{(i)})$, tends to zero in probability if $\theta \to 0$,
has the Poisson distribution with parameter $\Lambda_i$, $i = 1, 2, 3$, respectively, if $\theta \to \lambda$,
$0 < \lambda < 1$, where
$$\Lambda_1 = -\frac{1}{2}\log(1 - \lambda) + \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_2 = -\frac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} + \frac{\lambda^2}{4},$$
$$\Lambda_3 = -\frac{1}{2}\log(1 - \lambda) - \frac{\lambda}{2} - \frac{\lambda^2}{4},$$
and is asymptotically normal with parameters $(-\frac{1}{2}\log\varepsilon, -\frac{1}{2}\log\varepsilon)$ if $\varepsilon \to 0$,
$\varepsilon^3 n \to \infty$. Thus, $\theta = 1$ is a singular point and one can correctly suppose that
the behavior of the graphs near this point is interesting but difficult to investigate.
Indeed, not much is known about the properties of critical graphs. We present here
only one assertion about this behavior.

Recall that $A_{n,T}^{(i)}$ is the set of graphs with $n$ labeled vertices and $T$ edges that
consists of trees and unicyclic components with neither loops nor multiple edges
for $i = 3$, without loops and with cycles of length 2 allowed for $i = 2$, and with
cycles of lengths 1 and 2 allowed for $i = 1$.


Theorem 2.2.1. If $n, T \to \infty$ such that $\varepsilon n^{1/3} \to 2\cdot 3^{-2/3}v$, where $v$ is a constant, then for any random graph $G^{(i)}_{n,T}$, $i = 1, 2, 3$,

$$P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\} = \frac{\sqrt{3\pi}}{\sqrt{2}\,\Gamma(1/4)}\,e^{4v^3/27}\,p(v)(1 + o(1)),$$

where

$$p(v) = \int_0^{\infty} y^{-3/4}\,p(-v - y;\,3/2, -1)\,dy$$

and $p(y; 3/2, -1)$ is the density of the stable law, introduced in Theorem 1.4.2, with the characteristic function

$$f(t) = \exp\{-|t|^{3/2}e^{i\pi t/(4|t|)}\}.$$


Proof. It is clear that

$$P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\} = a^{(i)}_{n,T}/g^{(i)}_{n,T},$$

where $a^{(i)}_{n,T}$ is the number of graphs in $A^{(i)}_{n,T}$ and $g^{(i)}_{n,T}$ is the number of graphs in the class $\mathcal{G}^{(i)}_{n,T}$, $i = 1, 2, 3$. In accordance with Theorem 1.8.10,


@ cinte" 
an T= WIN Pe + OW), 


where N =n —T, 
S3e!4 J3e71/4 J3e3/4 


= ———,_ 2Q = >... ss '7/1 = =. 
2Jmr(/4)) 22/4)’ | 2/2 (1/4) 

In the previous section, we proved that 
@ _ 2 ci(1) 
Sat ~ OFTT 

where c} (1) = e?/4, e(1) = e747, €3(1) = e734. 
Since $T = n(1 - \varepsilon)/2$ and $\varepsilon^3 n \to 8v^3/9$, we easily find
nie"T127 

2N N1/Nn2t 


(1 + o(1)), 


= 2/me"7(1 + o(1)) 
and, consequently,

$$\frac{a^{(i)}_{n,T}}{g^{(i)}_{n,T}} = \frac{\sqrt{3\pi}}{\sqrt{2}\,\Gamma(1/4)}\,e^{4v^3/27}\,p(v)(1 + o(1)).$$


The function $p(v)$ can be represented by a convergent power series. The function

$$g(v) = p(-v) = \int_0^{\infty} y^{-3/4}\,p(v - y;\,3/2, -1)\,dy$$

can be thought of as the convolution of the function

$$g_1(y) = \begin{cases} y^{-3/4}, & y > 0,\\ 0, & y \le 0,\end{cases}$$

and the function $g_2(y) = p(y; 3/2, -1)$, so that

$$g(v) = \int_{-\infty}^{\infty} g_1(y)\,g_2(v - y)\,dy.$$

Therefore the Fourier transform $\hat g(t)$ of the function $g(v)$ is the product of the Fourier transforms of the functions $g_1(y)$ and $g_2(y)$. The Fourier transform $\hat g_1(t)$ of the function $g_1(y)$ is

$$\hat g_1(t) = \frac{\sqrt{2}\,\pi\,e^{i\pi t/(8|t|)}}{\Gamma(3/4)\,|t|^{1/4}},$$

and the Fourier transform $\hat g_2(t)$ of the function $g_2(y) = p(y; 3/2, -1)$ is the characteristic function of this density:

$$\hat g_2(t) = \exp\{-|t|^{3/2}e^{i\pi t/(4|t|)}\}.$$

Hence

$$\hat g(t) = \frac{\sqrt{2}\,\pi\,e^{i\pi t/(8|t|)}}{\Gamma(3/4)\,|t|^{1/4}}\exp\{-|t|^{3/2}e^{i\pi t/(4|t|)}\}.$$

By the inversion formula,

$$g(v) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itv}\,\hat g(t)\,dt = \frac{1}{\sqrt{2}\,\Gamma(3/4)}\int_{-\infty}^{\infty} e^{-itv}\,|t|^{-1/4}e^{i\pi t/(8|t|)}\exp\{-|t|^{3/2}e^{i\pi t/(4|t|)}\}\,dt,$$

and therefore, under the hypotheses of Theorem 2.2.1,

$$P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\} = \frac{\sqrt{3\pi}}{\sqrt{2}\,\Gamma(1/4)\,\sqrt{2}\,\Gamma(3/4)}\,e^{4v^3/27}\,h(v)(1 + o(1)),$$

where

$$h(v) = \int_{-\infty}^{\infty} e^{itv}\,|t|^{-1/4}e^{i\pi t/(8|t|)}\exp\{-|t|^{3/2}e^{i\pi t/(4|t|)}\}\,dt.$$

Since $\Gamma(1/4)\Gamma(3/4) = \sqrt{2}\,\pi$, we obtain

$$P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\} = \frac{\sqrt{3}}{2\sqrt{2\pi}}\,e^{4v^3/27}\,h(v)(1 + o(1)). \qquad (2.2.1)$$

The function $h(v)$ can be represented by a convergent power series.


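Theorem 2.2.2. If $n, T \to \infty$ such that $\varepsilon n^{1/3} \to 2\cdot 3^{-2/3}v$, where $v$ is a constant, then for any random graph $G^{(i)}_{n,T}$, $i = 1, 2, 3$,

$$P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\} = P(v)(1 + o(1)),$$

where

$$P(v) = \sqrt{\frac{2}{3\pi}}\,e^{4v^3/27}\sum_{k=0}^{\infty}\frac{v^k}{k!}\,\Gamma\Big(\frac{2k}{3} + \frac12\Big)\cos\frac{\pi k}{3}.$$

Proof. Let us represent $h(v)$ by a power series in $v$. Since the left-hand side of (2.2.1) is real,

$$h(v) = \mathrm{Re}\int_{-\infty}^{\infty} e^{itv}\,|t|^{-1/4}e^{i\pi t/(8|t|)}\exp\{-|t|^{3/2}e^{i\pi t/(4|t|)}\}\,dt.$$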
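Consider first the integral

$$h_1(v) = \int_0^{\infty} e^{itv}\,t^{-1/4}e^{i\pi/8}\exp\{-t^{3/2}e^{i\pi/4}\}\,dt.$$

By expanding $e^{itv}$, we obtain

$$h_1(v) = e^{i\pi/8}\sum_{k=0}^{\infty}\frac{(iv)^k}{k!}\int_0^{\infty} t^{k-1/4}\exp\{-t^{3/2}e^{i\pi/4}\}\,dt.$$

After the change of variables $t^{3/2}e^{i\pi/4} = z$, we obtain

$$h_1(v) = \frac23\sum_{k=0}^{\infty}\frac{v^k}{k!}\,e^{i\pi k/3}\,\Gamma\Big(\frac{2k}{3} + \frac12\Big).$$

Therefore

$$\mathrm{Re}\,h_1(v) = \frac23\sum_{k=0}^{\infty}\frac{v^k}{k!}\cos\frac{\pi k}{3}\,\Gamma\Big(\frac{2k}{3} + \frac12\Big). \qquad (2.2.2)$$

Similarly, for

$$h_2(v) = \int_{-\infty}^{0} e^{itv}\,|t|^{-1/4}e^{-i\pi/8}\exp\{-|t|^{3/2}e^{-i\pi/4}\}\,dt = \int_0^{\infty} e^{-itv}\,t^{-1/4}e^{-i\pi/8}\exp\{-t^{3/2}e^{-i\pi/4}\}\,dt,$$

we obtain

$$\mathrm{Re}\,h_2(v) = \frac23\sum_{k=0}^{\infty}\frac{v^k}{k!}\cos\frac{\pi k}{3}\,\Gamma\Big(\frac{2k}{3} + \frac12\Big). \qquad (2.2.3)$$

The assertion of the theorem follows from (2.2.1), (2.2.2), and (2.2.3). □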


Theorem 2.2.2 allows us to calculate the limit values of $P\{G^{(i)}_{n,T} \in A^{(i)}_{n,T}\}$. For example,

$$P(0) = \sqrt{2/3}.$$


Some values of P(v) are given in Table 2.1. 
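As a numerical illustration that is not part of the original text, the series for $P(v)$ can be summed directly. The sketch below assumes the series form $P(v) = \sqrt{2/(3\pi)}\,e^{4v^3/27}\sum_{k\ge 0}\frac{v^k}{k!}\Gamma(2k/3 + 1/2)\cos(\pi k/3)$ of Theorem 2.2.2. The function name `critical_limit_probability` and the truncation at 400 terms are choices made here for illustration; the truncation is adequate only for moderate $|v|$, such as the range $-3 \le v \le 3$ of Table 2.1.

```python
import math


def critical_limit_probability(v: float, terms: int = 400) -> float:
    """Sum the power series for P(v) from Theorem 2.2.2 (illustrative sketch)."""
    if v == 0.0:
        # Only the k = 0 term survives: sqrt(2/(3*pi)) * Gamma(1/2) = sqrt(2/3).
        return math.sqrt(2.0 / 3.0)
    log_abs_v = math.log(abs(v))
    total = 0.0
    for k in range(terms):
        # Work with logarithms so that Gamma(2k/3 + 1/2) and k! never overflow.
        log_size = (k * log_abs_v
                    - math.lgamma(k + 1)
                    + math.lgamma(2.0 * k / 3.0 + 0.5))
        sign = -1.0 if (v < 0 and k % 2 == 1) else 1.0
        total += sign * math.exp(log_size) * math.cos(math.pi * k / 3.0)
    prefactor = math.sqrt(2.0 / (3.0 * math.pi)) * math.exp(4.0 * v ** 3 / 27.0)
    return prefactor * total


if __name__ == "__main__":
    for v in (-1.0, 0.0, 0.2, 1.0):
        print(f"v = {v:+.1f}  P(v) = {critical_limit_probability(v):.4f}")
```

For negative $v$ the terms alternate in sign and partially cancel, so double precision limits the accuracy for large $|v|$; an arbitrary-precision library would be the safer choice outside the tabulated range.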


Table 2.1. Values of P(v)

  v      P(v)       v      P(v)       v      P(v)
-3.0    0.0053    -1.0    0.4919     1.2    0.9563
-2.8    0.0118    -0.8    0.5727     1.4    0.9653
-2.6    0.0239    -0.6    0.6470     1.6    0.9722
-2.4    0.0443    -0.4    0.7128     1.8    0.9776
-2.2    0.0755    -0.2    0.7693     2.0    0.9819
-2.0    0.1196     0.2    0.8551     2.2    0.9852
-1.8    0.1768     0.4    0.8860     2.4    0.9878
-1.6    0.2461     0.6    0.9105     2.6    0.9899
-1.4    0.3244     0.8    0.9297     2.8    0.9915
-1.2    0.4078     1.0    0.9447     3.0    0.9929


2.3. Random graphs with independent edges


When we were determining the number of graphs in the classes $\mathcal{G}^{(i)}_{n,T}$, $i = 1, 2, 3$, in Section 2.1, we associated each of the classes with the corresponding equiprobable scheme of allocating particles into cells. It is easily seen from these correspondences that the realizations of each of the random graphs $G^{(i)}_{n,T}$, $i = 1, 2, 3$, could be obtained by a sequential allocation of particles, but these random allocations are dependent. For example, if a pair of vertices has been connected in the random graph $G^{(3)}_{n,T}$ after allocating some of the edges, then the outcomes of all subsequent allocations cannot be the edges connecting these two vertices.

The classes of random graphs whose edges are independent seem to be easier to investigate by using the methods of probability theory. The best-known random graph with this property is $G_{n,p}$ with $n$ vertices such that each of the $\binom{n}{2}$ possible edges belongs to the edge set of $G_{n,p}$ with probability $p$ independently of the behavior of the other edges. This graph has a random number of edges with the binomial distribution with $\binom{n}{2}$ trials and the probability of success $p$.

In this section, we consider the random graph $G_{n,T}$ with $n$ vertices labeled $1, \ldots, n$ and $T$ edges that can be obtained by $T$ independent trials. In each trial, the loop at any vertex $i$ occurs with probability $n^{-2}$ and the edge connecting the vertices $i$ and $j$, $i \ne j$, occurs with probability $2n^{-2}$. In other words, if the edge set of $G_{n,T}$ consists of $T$ edges $(i(1), j(1)), \ldots, (i(T), j(T))$, then $i(1), j(1), \ldots, i(T), j(T)$ are independent identically distributed random variables taking the values $1, 2, \ldots, n$ with equal probabilities. It is clear that the realizations of the random graph $G_{n,T}$ are not equiprobable. For example, for $n = 2$ and $T = 1$, the two graphs with a loop and an isolated vertex have probability 1/4 each, and the connected graph has probability 1/2. Nevertheless, this model has some advantages and is conducive to treatment by probabilistic methods.
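The small example with two vertices and one edge can be checked by brute force. The following sketch, whose function name `realization_probabilities` is hypothetical, enumerates all $n^{2T}$ equally likely endpoint sequences and groups them by the resulting multiset of unordered edges. It is feasible only for very small $n$ and $T$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def realization_probabilities(n: int, T: int) -> dict:
    """Exact distribution of the edge multiset of the graph with 2T i.i.d. endpoints."""
    counts = Counter()
    # Each of the 2T endpoints is uniform on {1, ..., n}, independently of the others.
    for endpoints in product(range(1, n + 1), repeat=2 * T):
        edges = [tuple(sorted(endpoints[2 * k:2 * k + 2])) for k in range(T)]
        # The order of the trials does not matter for the realized graph.
        counts[tuple(sorted(edges))] += 1
    total = n ** (2 * T)
    return {graph: Fraction(count, total) for graph, count in counts.items()}


if __name__ == "__main__":
    for graph, probability in sorted(realization_probabilities(2, 1).items()):
        print(graph, probability)
```

A loop at a given vertex arises from one endpoint sequence, while an edge between two distinct vertices arises from two, which is exactly the source of the nonuniformity.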

Since $i(1), j(1), \ldots, i(T), j(T)$ are independent identically distributed random variables, we can associate to the random graph $G_{n,T}$ the classical scheme of allocating particles where $2T$ particles are allocated into $n$ cells such that each particle falls into any of $n$ cells with probability $1/n$ independently of the allocations of the other particles. By using this relationship, we can, for example, easily find the distribution of the number of loops in $G_{n,T}$. Indeed, we have $T$ trials, corresponding to $T$ edges, and in each of these trials a loop appears with probability $1/n$. Thus, the total number of loops $\alpha_1$ in $G_{n,T}$ has the binomial distribution with parameters $(T, 1/n)$. The mean number of loops is $E\alpha_1 = T/n$. If $2T/n \to \lambda$, $0 < \lambda < \infty$, then the Poisson distribution with parameter $\lambda/2$ is the limit distribution for $\alpha_1$.
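To see how quickly the binomial law of the number of loops approaches its Poisson limit, one can compare the two distributions numerically. The sketch below is an illustration under assumptions made here: the helper names are hypothetical, $n \ge 2$ is assumed, and the total variation distance is truncated at `k_max` terms, which is harmless when the mean $T/n$ is of order 1.

```python
import math


def loop_count_pmf(T: int, n: int, k: int) -> float:
    """P{alpha_1 = k} for the binomial law with parameters (T, 1/n), via logarithms."""
    if k > T:
        return 0.0
    log_p = (math.lgamma(T + 1) - math.lgamma(k + 1) - math.lgamma(T - k + 1)
             - k * math.log(n) + (T - k) * math.log1p(-1.0 / n))
    return math.exp(log_p)


def poisson_pmf(mean: float, k: int) -> float:
    """P{X = k} for a Poisson variable with the given positive mean."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def total_variation(T: int, n: int, k_max: int = 60) -> float:
    """Truncated total variation distance between Bin(T, 1/n) and Poisson(T/n)."""
    mean = T / n
    return 0.5 * sum(abs(loop_count_pmf(T, n, k) - poisson_pmf(mean, k))
                     for k in range(k_max + 1))


if __name__ == "__main__":
    # Both cases have 2T/n = 1.5, so the limiting Poisson parameter is 0.75.
    print(total_variation(150, 200))
    print(total_variation(15000, 20000))
```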

Under the condition $\alpha_1 = m$, the other edges may be considered as the result of $T - m$ independent allocations into $\binom{n}{2}$ cells corresponding to the $\binom{n}{2}$ possible edges of the complete graph with $n$ vertices. Therefore, with $\alpha_1 = m$, the number $\alpha_2$ of cycles of length 2 in $G_{n,T}$ can be thought of as the number of cells with exactly two particles in the classical (equiprobable) scheme of allocation of $T - m$ particles into $\binom{n}{2}$ cells. The classical scheme of allocation has been well studied. In particular, if $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < \infty$, then the distribution of the number of cells occupied by exactly two particles each converges to the Poisson distribution with parameter $\lambda^2/4$. Since the limit distribution does not depend on $m$ for $m = o(n)$, averaging over the distribution of $\alpha_1$ shows that $\alpha_1$ and $\alpha_2$ are asymptotically independent and their distributions approach the Poisson distributions.


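Theorem 2.3.1. If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < \infty$, then for any fixed nonnegative integers $k_1$ and $k_2$,

$$P\{\alpha_1 = k_1, \alpha_2 = k_2\} = \frac{1}{k_1!}\Big(\frac{\lambda}{2}\Big)^{k_1}\frac{1}{k_2!}\Big(\frac{\lambda^2}{4}\Big)^{k_2} e^{-\lambda/2 - \lambda^2/4}(1 + o(1)).$$

Because the edges of $G_{n,T}$ are independent, we can apply direct probabilistic approaches to investigations of the structure of $G_{n,T}$.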


Theorem 2.3.2. If $n, T \to \infty$ such that $T/n \to 0$, then in $G_{n,T}$, with probability tending to 1, there are no cycles and all the components are trees.


Proof. Denote the number of cycles of length $r$ with $r$ distinct vertices by $\alpha_r$ and let $\nu(G_{n,T}) = \alpha_1 + \cdots + \alpha_n$ be the total number of cycles considered as induced subgraphs of $G_{n,T}$. We can represent $\alpha_r$ as a sum of indicators. The edges of $G_{n,T}$ appear sequentially in $T$ trials. We assign the numbers $1, 2, \ldots, T$ to the trials and arrange (in some order) all $\binom{T}{r}$ possible subsets of cardinality $r$ of the trial numbers. We define the random variable $\xi_i$ to be equal to 1 if the subset of trial numbers labeled with $i$ forms a cycle in $G_{n,T}$, and $\xi_i = 0$ otherwise. It is clear that

$$\alpha_r = \xi_1 + \cdots + \xi_{\binom{T}{r}}.$$

In turn, each of the random variables $\xi_1, \ldots, \xi_{\binom{T}{r}}$ can be represented as a sum of indicators. The cycle corresponding to the subset with label $i$ can be constructed from $r$ different vertices and $r$ different edges. There exist $\binom{n}{r}$ possibilities to choose these $r$ vertices and $(r - 1)!/2$ possibilities to construct a cycle from these $r$ vertices for $r \ge 3$. Each construction fixes $r$ edges that must occur. These $r$ edges can occur at $r$ fixed places of the subset labeled $i$, and there exist $r!$ possibilities to assign these $r$ edges to $r$ places. Thus the event $\{\xi_i = 1\}$ can be realized by one of the $\binom{n}{r}(r - 1)!\,r!/2$ variants.

For $r \ge 3$, each of these variants has the probability $(2/n^2)^r$. Thus,

$$E\alpha_r = \binom{T}{r}\binom{n}{r}\frac{(r - 1)!\,r!}{2}\Big(\frac{2}{n^2}\Big)^r. \qquad (2.3.1)$$

It is not difficult to check that this formula is also valid for $r = 1$ and $r = 2$. It follows from (2.3.1) that

$$E\alpha_r \le \frac{T^r n^r (r - 1)!\,r!}{r!\,r!\,2}\Big(\frac{2}{n^2}\Big)^r = \Big(\frac{2T}{n}\Big)^r\frac{1}{2r}.$$

Therefore,

$$E\nu(G_{n,T}) = \sum_{r=1}^{n} E\alpha_r$$

has the upper bound

$$E\nu(G_{n,T}) \le \sum_{r=1}^{\infty}\Big(\frac{2T}{n}\Big)^r\frac{1}{2r}.$$

Under the conditions of the theorem, $E\nu(G_{n,T})$ tends to zero and the number of cycles in $G_{n,T}$ is zero with probability approaching 1. □
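The expectation of the number of cycles of length $r$ and its bound $(2T/n)^r/(2r)$ are easy to evaluate exactly. The sketch below assumes the form $E\alpha_r = \binom{T}{r}\binom{n}{r}\frac{(r-1)!\,r!}{2}(2/n^2)^r$ of formula (2.3.1); the function names are hypothetical and exact rational arithmetic is used so that the inequality can be checked without rounding.

```python
from fractions import Fraction
from math import comb, factorial


def expected_cycle_count(n: int, T: int, r: int) -> Fraction:
    """Exact value of E(alpha_r) according to formula (2.3.1)."""
    arrangements = Fraction(factorial(r - 1) * factorial(r), 2)
    return comb(T, r) * comb(n, r) * arrangements * Fraction(2, n * n) ** r


def cycle_count_bound(n: int, T: int, r: int) -> Fraction:
    """Upper bound (2T/n)^r / (2r) for E(alpha_r)."""
    return Fraction(2 * T, n) ** r / (2 * r)


def total_cycle_bound(n: int, T: int) -> Fraction:
    """Sum of the bounds over r = 1, ..., n, an upper bound for E nu."""
    return sum(cycle_count_bound(n, T, r) for r in range(1, n + 1))


if __name__ == "__main__":
    n, T = 50, 10
    for r in range(1, 6):
        print(r, float(expected_cycle_count(n, T, r)), float(cycle_count_bound(n, T, r)))
```

When $2T/n < 1$ the full series of bounds sums to $-\frac12\log(1 - 2T/n)$, which is small when $T/n$ is small, in line with the conclusion of the proof.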


We denote by $A_{n,T}$ the set of all graphs with $n$ labeled vertices and $T$ edges whose components are trees and unicyclic components. Note that loops and cycles of length 2 are permitted. As before, $\theta = 2T/n$, $\varepsilon = 1 - 2T/n$.


Theorem 2.3.3. If $n, T \to \infty$ such that $\varepsilon^3 n \to \infty$, then

$$P\{G_{n,T} \notin A_{n,T}\} \le \frac{4}{\varepsilon^3 n}.$$


Proof. We have to prove that under the conditions of the theorem, the probability that the graph $G_{n,T}$ has a component with more than one cycle is less than $4/(\varepsilon^3 n)$. If
in G,,r there exists such a component, then in G,,,7 there either exists a subgraph 
that consists of two cycles connected by a chain (pince-nez) or there exist two 
cycles that have a common sequence of edges (a cycle with a bridge). We use aw 
to denote the number of subgraphs of G, 7 that consist of cycles of lengths r and 
s connected by a chain of t edges, and denote by EM the number of subgraphs of 
G,,r that consist of a cycle of length r with two vertices connected by a sequence 
of t edges. To prove the assertion of the theorem, it is sufficient to show that the 
mean number of such subgraphs tends to zero. It is clear that


P{Gnr ¢ Ant} = 2p - ae > o <E (See + rx}. 


r,s,t r,s,t 


By reasoning in the same way as in the proof of formula (2.3.1), we obtain the 
estimates 


Es < (, oe :) (“Te —Di(r—-Dit—D! 


T 2 r+t or (2T r+t 
pile \ eee) 
ae) Ola) 2am) 


1) 
Eg) < n +stt—- DY vg _ay 
br Oye rist(¢—D! no) 


T 2 r+s+t 2 oT r+s+t 
8 er <-[— : 
«(pra Je tstot(S) < ?(2) 


Thus, the mathematical expectation of the total number of pince-nez and cycles 
with a bridge can be estimated as follows: 


> Eg) + > Eg) 


rt=0 r,s,t=0 
(t) r+t (t) rts+t 
= yor(=) +2 > (4) gas 3" 
fb n n ere n(1 —27/n) 


Theorem 2.3.4. If $n, T \to \infty$ such that $\theta = 2T/n \to \lambda$, $0 < \lambda < 1$, then the distribution of the number of cycles $\nu(G_{n,T})$ in $G_{n,T}$ converges to the Poisson distribution with parameter

$$\Lambda = -\frac12\log(1 - \lambda).$$


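Proof. In view of Theorems 2.3.1 and 2.3.3, we can reduce the proof to the application of Theorem 2.1.5 concerning the random graph $G^{(3)}_{n,T}$ without loops and multiple edges. Indeed, by the formula of total probability,

$$P\{\nu(G_{n,T}) = k\} = \sum_{k_1 + k_2 \le k} P\{\alpha_1 = k_1, \alpha_2 = k_2, G_{n,T} \in A_{n,T}\}\, P\{\nu(G_{n,T}) = k \mid \alpha_1 = k_1, \alpha_2 = k_2, G_{n,T} \in A_{n,T}\}$$
$$\qquad + \sum_{k_1 + k_2 \le k} P\{\alpha_1 = k_1, \alpha_2 = k_2, G_{n,T} \notin A_{n,T}\}\, P\{\nu(G_{n,T}) = k \mid \alpha_1 = k_1, \alpha_2 = k_2, G_{n,T} \notin A_{n,T}\}.$$

According to Theorem 2.3.3, $P\{G_{n,T} \notin A_{n,T}\} \to 0$, and it is not difficult to see that

$$P\{G_{n,T} \in A_{n,T} \mid \alpha_1 = k_1, \alpha_2 = k_2\} = P\{G^{(3)}_{n,T-k_1-2k_2} \in A^{(3)}_{n,T-k_1-2k_2}\},$$
$$P\{\nu(G_{n,T}) = k \mid \alpha_1 = k_1, \alpha_2 = k_2, G_{n,T} \in A_{n,T}\} = P\{\kappa(G^{(3)}_{n,T-k_1-2k_2}) = k - k_1 - k_2\}.$$

Thus

$$P\{\nu(G_{n,T}) = k\} = \sum_{k_1 + k_2 \le k} P\{\alpha_1 = k_1, \alpha_2 = k_2\}\, P\{\kappa(G^{(3)}_{n,T-k_1-2k_2}) = k - k_1 - k_2\}(1 + o(1)) + o(1). \qquad (2.3.2)$$

According to Theorem 2.1.5, under the conditions of Theorem 2.3.4, for any fixed $k_1, k_2 = 0, 1, \ldots$, and $k \ge k_1 + k_2$,

$$P\{\kappa(G^{(3)}_{n,T-k_1-2k_2}) = k - k_1 - k_2\} \to \frac{\Lambda_3^{k-k_1-k_2}\,e^{-\Lambda_3}}{(k - k_1 - k_2)!},$$

where

$$\Lambda_3 = -\frac12\log(1 - \lambda) - \frac{\lambda}{2} - \frac{\lambda^2}{4}.$$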

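Now it follows from (2.3.2) and Theorem 2.3.1 that

$$P\{\nu(G_{n,T}) = k\} = \sum_{k_1 + k_2 \le k}\frac{1}{k_1!}\Big(\frac{\lambda}{2}\Big)^{k_1}\frac{1}{k_2!}\Big(\frac{\lambda^2}{4}\Big)^{k_2} e^{-\lambda/2 - \lambda^2/4}\,\frac{\Lambda_3^{k-k_1-k_2}\,e^{-\Lambda_3}}{(k - k_1 - k_2)!}(1 + o(1)) + o(1)$$
$$\qquad = \frac{e^{-\Lambda}}{k!}\sum_{k_1 + k_2 \le k}\frac{k!}{k_1!\,k_2!\,(k - k_1 - k_2)!}\Big(\frac{\lambda}{2}\Big)^{k_1}\Big(\frac{\lambda^2}{4}\Big)^{k_2}\Lambda_3^{k-k_1-k_2}(1 + o(1)) + o(1)$$
$$\qquad = \frac{\Lambda^k e^{-\Lambda}}{k!}(1 + o(1)),$$

where

$$\Lambda = \frac{\lambda}{2} + \frac{\lambda^2}{4} + \Lambda_3 = -\frac12\log(1 - \lambda). \qquad \square$$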
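The last step of the proof is the familiar fact that a sum of independent Poisson variables is again Poisson. As a sanity check, the sketch below convolves the three limiting laws, with parameters $\lambda/2$ for loops, $\lambda^2/4$ for cycles of length 2, and $\Lambda_3 = -\frac12\log(1-\lambda) - \lambda/2 - \lambda^2/4$ for longer cycles, and compares the result with the Poisson law with parameter $-\frac12\log(1 - \lambda)$. The function names are hypothetical.

```python
import math


def poisson_pmf(mean: float, k: int) -> float:
    """P{X = k} for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def cycle_count_pmf_by_parts(lam: float, k: int) -> float:
    """Convolve the Poisson laws for loops, 2-cycles, and cycles of length >= 3."""
    lambda_3 = -0.5 * math.log(1.0 - lam) - lam / 2.0 - lam ** 2 / 4.0
    total = 0.0
    for k1 in range(k + 1):
        for k2 in range(k - k1 + 1):
            total += (poisson_pmf(lam / 2.0, k1)
                      * poisson_pmf(lam ** 2 / 4.0, k2)
                      * poisson_pmf(lambda_3, k - k1 - k2))
    return total


def cycle_count_pmf_limit(lam: float, k: int) -> float:
    """Poisson law with parameter -(1/2) log(1 - lam), 0 < lam < 1."""
    return poisson_pmf(-0.5 * math.log(1.0 - lam), k)


if __name__ == "__main__":
    for k in range(4):
        print(k, cycle_count_pmf_by_parts(0.6, k), cycle_count_pmf_limit(0.6, k))
```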


By reasoning in the same way, we can reformulate the theorems proved for $G^{(3)}_{n,T}$ so that they can also be applied to subcritical and critical graphs $G_{n,T}$. As an example, we give an analogue of Theorem 2.1.6 on the number $\kappa(G_{n,T})$ and on the maximum sizes $\eta(G_{n,T})$, $\beta(G_{n,T})$, and $\alpha(G_{n,T})$ of trees, unicyclic components, and all components in $G_{n,T}$, respectively.


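Theorem 2.3.5. If $n, T \to \infty$ such that $\varepsilon = 1 - 2T/n \to 0$ and $\varepsilon^3 n \to \infty$, then for any fixed $x$,

$$P\Big\{\kappa(G_{n,T}) + \tfrac12\log\varepsilon \le x\sqrt{-\tfrac12\log\varepsilon}\Big\} \to \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-u^2/2}\,du;$$

for any fixed $x > 0$,

$$P\{\varepsilon^2\beta(G_{n,T}) \le x\} = \sum_{s=0}^{\infty}\frac{(-1)^s}{s!}\,Z_s(x)(1 + o(1)),$$

where $Z_s(x)$ is defined in Theorem 1.8.7; and

$$P\{\beta\alpha(G_{n,T}) - u \le z\} = P\{\beta\eta(G_{n,T}) - u \le z\}(1 + o(1)) = e^{-e^{-z}}(1 + o(1)),$$

where $\beta = -\log(\theta e^{1-\theta})$, $\theta = 2T/n$, and $u$ is the root of the equation

$$\Big(\frac{2}{\pi}\Big)^{1/2}(n - T)\beta^{3/2} = u^{5/2}e^{u}.$$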
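For the same reasons, Theorem 2.2.2 can be extended to the critical graph $G_{n,T}$.


Theorem 2.3.6. If $n, T \to \infty$ such that $\varepsilon n^{1/3} \to 2\cdot 3^{-2/3}v$, where $v$ is a constant, then

$$P\{G_{n,T} \in A_{n,T}\} = \sqrt{\frac{2}{3\pi}}\,e^{4v^3/27}\sum_{k=0}^{\infty}\frac{v^k}{k!}\,\Gamma\Big(\frac{2k}{3} + \frac12\Big)\cos\frac{\pi k}{3} + o(1).$$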


For the supercritical case where $n, T \to \infty$ such that $\varepsilon^3 n \to -\infty$, we present here only the simplest results. In the final section of this chapter, we will give a short review of what is known about the supercritical graphs.

It is known that if $\theta = 2T/n \to \lambda$, $\lambda > 1$, a giant component appears in the graph $G^{(3)}_{n,T}$ and, with probability tending to 1, $G^{(3)}_{n,T}$ consists of trees, unicyclic components, and this giant component formed by all the vertices that are not contained in trees and unicyclic components. As $2T/n$ increases, the size of the giant component increases and the number of unicyclic components decreases.

If $\theta = 2T/n \to \lambda$, $1 < \lambda < \infty$, then the number of unicyclic components has in the limit a Poisson distribution. For $\theta \to \infty$, we have the following result.


Theorem 2.3.7. If $n, T \to \infty$ such that $\theta = 2T/n \to \infty$, then with probability tending to 1, there are no unicyclic components in $G_{n,T}$.

Proof. The number of unicyclic graphs with $r$ labeled vertices is not greater than $c\,r^{r-1/2}$, where $c$ is a constant (see, e.g., [16]). Denote by $\kappa_r(G_{n,T})$ the number of unicyclic components of size $r$ in $G_{n,T}$. By reasoning as in the proof of (2.3.1), we find that


F T-r 
Ex,(Gn T)<c e - aie a ee 1- a) =i) = rl) —) , 
: r}/\r n2 ne my) 


(233) 


where the last factor is the probability that the T — r edges, which were not used 
for the construction of unicyclic components, neither connect the vertices in the 
component with the vertices outside the component nor connect any pair of vertices 
in the component. 

It is sufficient to prove that 


do Ex-(Gr.r) > 0. 


l<r<n 


With the help of estimate (2.3.3), we find that 


.y Ex,(Gn,r) < ys (Bey e~2—CFD/2T=r)/n? 


l<r<n I<r<n 
For sufficiently large n and 1 <r <n, 


er (n—(r+1)/2)(T 1) /n? < e779/4 


1-9/4 — 1. Therefore 


foe) 
D ExGuns Oa =p 
r=1 


l<r<n 


and q = 6e 


Since $q = \theta e^{1-\theta/4} \to 0$ as $\theta \to \infty$, we conclude that a unicyclic component exists in $G_{n,T}$ with a probability that tends to zero. □


Finally, we consider the behavior of the random graph $G_{n,T}$ near the point where the graph becomes connected. Denote the number of components in $G_{n,T}$ by $\kappa_{n,T}$.


Theorem 2.3.8. If $n \to \infty$ and $2T = n\log n + xn + o(n)$, where $x$ is a constant, then with probability tending to 1, the graph consists of a giant connected component and isolated vertices. Also, for any fixed integer $k = 0, 1, \ldots$,

$$P\{\kappa_{n,T} - 1 = k\} \to \frac{e^{-kx}}{k!}\,e^{-e^{-x}}.$$
Proof. We have to prove that, with probability tending to 1, $G_{n,T}$ consists of one giant component and isolated vertices, and that the distribution of the number of these isolated vertices converges to the Poisson distribution with parameter $e^{-x}$.

The edges of $G_{n,T}$ appear as a result of $T$ independent trials, and these $T$ trials can be considered as the allocation of $2T$ particles into $n$ cells such that each particle is allocated independently of the others and, with equal probabilities, falls into any of $n$ cells. Therefore the number of isolated vertices in $G_{n,T}$ has the same distribution as the number $\mu_0(2T, n)$ of empty cells in the well-studied classical scheme of allocating particles. Under the conditions of the theorem, the distribution of $\mu_0(2T, n)$ converges to the Poisson distribution with parameter $e^{-x}$.
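The Poisson parameter $e^{-x}$ can be recognized from the mean number of empty cells, $n(1 - 1/n)^{2T}$, which is a standard fact for the classical allocation scheme. The sketch below evaluates this mean for $2T \approx n\log n + xn$ and also tabulates the limit law of Theorem 2.3.8. The function names are hypothetical, and rounding $2T$ to an even integer is an assumption made here; it changes $2T$ by at most 1, well within the $o(n)$ term.

```python
import math


def expected_isolated_vertices(n: int, x: float) -> float:
    """Mean number of empty cells when 2T particles are thrown into n cells."""
    two_T = 2 * round((n * math.log(n) + x * n) / 2.0)  # nearest even integer
    # n * (1 - 1/n)^(2T), computed through logarithms for accuracy.
    return n * math.exp(two_T * math.log1p(-1.0 / n))


def limit_component_pmf(x: float, k: int) -> float:
    """Limit of P{kappa_{n,T} - 1 = k}: the Poisson law with parameter exp(-x)."""
    mean = math.exp(-x)
    return mean ** k * math.exp(-mean) / math.factorial(k)


if __name__ == "__main__":
    x = 0.5
    for n in (10 ** 3, 10 ** 4, 10 ** 6):
        print(n, expected_isolated_vertices(n, x), math.exp(-x))
    # k = 0 corresponds to a connected graph.
    print(limit_component_pmf(x, 0))
```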

To complete the proof, it suffices to show that, with probability tending to 1, the remaining vertices form one giant component. If, in addition to the isolated vertices, there were two other components, then the graph would contain a tree of size $r$, $2 \le r \le n/2$, such that no vertex of the tree would be connected to any vertex outside the tree. A skeleton of one of the two components could play the role of such a tree.

By $\xi_r$ we denote the number of trees of size $r$ which are the skeletons of connected components of $G_{n,T}$. We will show that under the conditions of the theorem,

$$\sum_{2 \le r \le n/2} E\xi_r \to 0,$$

and consequently, with probability tending to 1, such a tree does not occur in $G_{n,T}$. We can represent $\xi_r$ as a sum of indicators and find that


r-1 _ T-r+l 
Eé, = ( s \()eoe (=) (1 = or) g 34) 
r-i/\r n? n? 


This formula is similar to (2.3.1): We choose $r$ vertices and $r - 1$ edges that form the tree, and the last factor is the probability that none of the $T - r + 1$ edges that remain connects a vertex from the set of $r$ selected vertices with a vertex from the set of $n - r$ remaining vertices.

By using formula (2.3.4), we can check, for example, that with probability tending to 1, there are no isolated edges in $G_{n,T}$. Indeed, for $r = 2$,


Z T-1 
Ef) <2T (: = 57) < 2Te4-DT-D/n* (2.3.5) 
n 
and the right-hand side of (2.3.5) tends to zero if $n \to \infty$ and $2T = n\log n + xn + o(n)$.
It follows from (2.3.4) that


-1 —2 
Eé, < vn 7 a ener rT rt) /n? 
~ r—Dirtnee— 
and for all sufficiently large $n$,


Therefore 
n= 
ae 1—49/9)r 
x Eé, < oT a (Ge ) 
3<r<n/2 r=3 
n?(be!-49/9)? 
2T (1 — de!-49/9) ° 
Ifn > oo and2T =nlogn + xn + o(n), then 


Qe!—46/9 —4logn/9+1—4x/9+o0(1) 


= 2 logne 


and for all sufficiently large n, 


clogn 
Ge!—40/9 < 
£ = 74/9 
where c is a constant. 
Therefore, under the conditions of the theorem, 


yo Eg +0. 


3<r<n/2 


Taking into account that $E\xi_2 \to 0$ also, we see that, with probability tending to 1, the graph $G_{n,T}$ has only one component besides the isolated vertices. □


2.4. Nonequiprobable graphs 


The model of the random graph $G_{n,T}$ considered in the previous section can be easily extended to nonequiprobable graphs. However, the approach based on the generalized scheme of allocation, which reduces the investigations of equiprobable graphs to some problems concerning sums of independent random variables, does not apply to nonequiprobable graphs. In this case, few results have been obtained because of the lack of effective methods to investigate these objects.

In this section, we consider a generalization of the random graph $G_{n,T}$ of the previous section. We preserve the notation $G_{n,T}$ for this nonequiprobable graph with $n$ vertices labeled with the numbers $1, 2, \ldots, n$ and $T$ edges, which can be obtained by the following procedure. We consider $T$ independent trials, in each of which one edge is drawn. The edge connects two different vertices or forms a loop; the vertices with labels $i$ and $j$ are connected with the probability $2p_ip_j$, and the loop at vertex $i$ is formed with the probability $p_i^2$, $i, j = 1, \ldots, n$, $p_1, \ldots, p_n \ge 0$, $p_1 + \cdots + p_n = 1$. Thus, after $T$ trials we have a realization of the random graph $G_{n,T}$, which may have loops and multiple edges.

The main result of this section is the following assertion.


Theorem 2.4.1. Assume that $p_i = a_i/n$, where $a_i = a_i(n)$, $0 < \varepsilon \le a_i \le E$, $i = 1, \ldots, n$, $\varepsilon$ and $E$ are constants, and the limit

$$a^2 = \lim_{n \to \infty}\frac{1}{n}\sum_{i=1}^{n} a_i^2$$

exists.

Then, if $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda a^2 < 1$, the distribution of the number of cycles $\nu(G_{n,T})$ in the graph $G_{n,T}$ converges to the Poisson distribution with parameter $\Lambda = -\frac12\ln(1 - \lambda a^2)$.


In proving the theorem, the limit distribution of the random variable $\alpha_r$, the number of cycles of length $r$, and the joint limit distribution of $\alpha_{r_1}, \ldots, \alpha_{r_s}$ are obtained.


Theorem 2.4.2. Under the conditions of Theorem 2.4.1, without the requirement $\lambda a^2 < 1$, the distribution of the random variable $\alpha_r$ for any fixed $r$ tends to the Poisson distribution with parameter $\lambda_r = \lambda^r a^{2r}/(2r)$.


Theorem 2.4.3. Under the conditions of Theorem 2.4.1, without the requirement $\lambda a^2 < 1$, the joint distribution of $\alpha_{r_1}, \ldots, \alpha_{r_s}$ for any fixed $1 \le r_1 < \cdots < r_s$ converges to the distribution of $s$ independent random variables that have the Poisson distributions with parameters $\lambda_{r_1}, \ldots, \lambda_{r_s}$, respectively.
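The three theorems fit together through the series $\sum_{r \ge 1} x^r/(2r) = -\frac12\ln(1 - x)$ with $x = \lambda a^2$: the parameters $\lambda_r = \lambda^r a^{2r}/(2r)$ of the individual cycle counts add up to the parameter $\Lambda$ of the total count. The sketch below checks this numerically; the function names and the argument `a2`, which stands for the limit $a^2$, are notation introduced here.

```python
import math


def cycle_parameter(lam: float, a2: float, r: int) -> float:
    """lambda_r = lam^r * a^(2r) / (2r), where a2 denotes the limit a^2."""
    return (lam * a2) ** r / (2 * r)


def total_cycle_parameter(lam: float, a2: float) -> float:
    """Lambda = -(1/2) ln(1 - lam * a^2), defined only for 0 < lam * a^2 < 1."""
    if not 0.0 < lam * a2 < 1.0:
        raise ValueError("the total parameter requires 0 < lam * a2 < 1")
    return -0.5 * math.log(1.0 - lam * a2)


if __name__ == "__main__":
    lam, a2 = 0.5, 1.2
    partial = sum(cycle_parameter(lam, a2, r) for r in range(1, 201))
    print(partial, total_cycle_parameter(lam, a2))
```

With `a2 = 1.0`, which corresponds to all $a_i = 1$, the total parameter reduces to $-\frac12\log(1 - \lambda)$, the value obtained for the equiprobable endpoints of Section 2.3.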


The proof will be accomplished by the method of moments.

A cycle of length $r$ has no self-intersections if it is composed of $r$ vertices and exactly $r$ edges of $G_{n,T}$. Denote by $\alpha_r$ the number of cycles without self-intersections of length $r$, $r \ge 3$, in the random graph $G_{n,T}$. For $r$ distinct vertices $i_1, \ldots, i_r$, let $\xi_{i_1,\ldots,i_r} = 1$ if in $G_{n,T}$ there exists a cycle composed of these $r$ vertices containing exactly $r$ edges of $G_{n,T}$; in other cases, we set $\xi_{i_1,\ldots,i_r} = 0$. Then

$$\alpha_r = \sum_{\{i_1,\ldots,i_r\}}\xi_{i_1,\ldots,i_r}, \qquad (2.4.1)$$

where the summation is taken over all $\binom{n}{r}$ distinct unordered sets of $r$ distinct indices. In the complete graph with vertices $i_1, \ldots, i_r$, there exist $(r - 1)!/2$ distinct cycles containing exactly $r$ edges. We label these cycles in an arbitrary order with the numbers $j = 1, \ldots, (r - 1)!/2$ and represent the random variable $\xi_{i_1,\ldots,i_r}$ as the sum of indicators:

$$\xi_{i_1,\ldots,i_r} = \sum_{j=1}^{(r-1)!/2}\xi^{(j)}_{i_1,\ldots,i_r}, \qquad (2.4.2)$$

where $\xi^{(j)}_{i_1,\ldots,i_r} = 1$ if the $j$th cycle exists in $G_{n,T}$, and $\xi^{(j)}_{i_1,\ldots,i_r} = 0$ otherwise.

We now investigate the behavior of the random variable

$$\nu(G_{n,T}) = \alpha_1 + \cdots + \alpha_n,$$

where the variables $\alpha_r$ are defined by (2.4.1) for $r \ge 3$, $\alpha_1$ is the number of loops, and $\alpha_2$ is the number of pairs of parallel edges in $G_{n,T}$.

Each cycle in the graph $G_{n,T}$ may be thought of as the set of edges that form this cycle; therefore, the following assertion is needed for evaluating such probabilities as $P\{\xi^{(j)}_{i_1,\ldots,i_r} = 1\}$. Let $V_r = \{(i_1, j_1), \ldots, (i_r, j_r)\}$ be the set of $r$ distinct pairs of vertices in the graph $G_{n,T}$, where $i_k \ne j_k$, $k = 1, \ldots, r$. Denote by $P(V_r)$ the probability of the event that all the edges from $V_r$ occur in $G_{n,T}$.


Lemma 2.4.1. If $n, T \to \infty$, $2T/n \to \lambda$, $0 < \lambda < \infty$, $0 < \varepsilon \le a_i \le E < \infty$, $i = 1, \ldots, n$, then for arbitrary fixed $\varepsilon$, $E$, and $r$,


PU) = aay «01,4 (1 +0 (;)) (2.4.3) 


uniformly with respect to $a_1, \ldots, a_n$ and all sets $V_r$.
Moreover, for any $\delta > 0$, there exists a constant $c$ such that, for all $r$ and $n$,


P;) <ett® 


Qj, 4j, +++ @j,Qj,. (2.4.4) 


Proof. Set $q_k = 2p_{i_k}p_{j_k}$, $k = 1, \ldots, r$. Then


plmit—+mr] 


PU) Se Doo ae ae 
m4,...,.Mp>1 1 
T—m,—---—m, 
(= gis = gr) 
= T'"\q, a(a-a- vege 


(T — rylmit torr] vi ed 


m 
+ +3 Wikre = 1 “Or 
x (l= gis corres). (2.4.5) 


Here x!"J = x(x — 1)--- (x —m +1); the summation in YY is taken over all sets 
{m,,...,m,} in which m,...,m, > 1 and there exists i, 1 < i < r, such that 
$m_i > 1$. It is clear that


G=giaheegy "Sk 
and for an arbitrary fixed r, 
(l—q1 —-:-—q)'~" =1+ O(1/n). (2.4.6) 
In addition, 
(T- pylmitetmr—r] - = fe ea, 
> 7?" ee Me | rnd 7 m\ mr 
m,!---my! 
; 
< \oalT -n5S, (2.4.7) 
j=1 
where 


(T —r — 1ylitetm—r-1 


a, oe 


Loewe ! 
M1 ,...,Mp21 my): mr: 
m;>| 
mi—1 my—1 
a a T—mj—-— 
—— ge ee a my) my 
I 


Let $l_i = m_i - 2$, $l_j = m_j - 1$, $j \ne i$ (recall that $m_i > 1$). Then


(T—-r- pitt] 


= ara vlte 
too AFD GED! G+! 


Upsasey 
x qi Sie gir (l -—q-c a 


(T-r- platter] 


pe ee 


h,...,J,20 


xg .--gh(—gi— ++ — grt 1. (2.4.8) 


Now assertion (2.4.3) follows from (2.4.5)—(2.4.8), and assertion (2.4.4) from 
(2.4.5), (2.4.7), and (2.4.8), since 


A 


r 2 
2TrE 
: q(T —r) = 2 ’ 
i=] . 


(2T)" 
TUNG) -- gy poe eee So Sue he 


IA 
Corollary 2.4.1. If $n, T \to \infty$, $2T/n \to \lambda$, $0 < \lambda < \infty$, $0 < \varepsilon \le a_i \le E < \infty$, $i = 1, \ldots, n$, then for arbitrary fixed $\varepsilon$, $E$, and $r$,


P{E, = l}= = Pap ots i 2 (1 + O (-)) 
r nv 1 Gi, n 


uniformly with respect to $j$, $1 \le j \le (r - 1)!/2$, all sets $\{i_1, \ldots, i_r\}$, and $a_1, \ldots, a_n$.
Moreover, for any $\delta > 0$, there exists a constant $c$ such that, for all $r$ and $n$,


A+)" 
aol ES NE 


Proof. The equality $\xi^{(j)}_{i_1,\ldots,i_r} = 1$ holds if and only if in $G_{n,T}$ there exist $r$ fixed edges, $\{(k_1, j_1), \ldots, (k_r, j_r)\}$, $k_\nu \ne j_\nu$, $\nu = 1, \ldots, r$, which form the $j$th cycle on the vertices $i_1, \ldots, i_r$. For these edges, the sets $\{k_1, \ldots, k_r\}$ and $\{j_1, \ldots, j_r\}$ coincide with the set $\{i_1, \ldots, i_r\}$. Therefore, the corollary follows from Lemma 2.4.1. □


The notation $\{i_1, \ldots, i_r\}$ denotes an unordered set of distinct indices $i_1, \ldots, i_r$; the number of such sets is $\binom{n}{r}$. For ordered sets of distinct indices $i_1, \ldots, i_r$, we will use the notation $(i_1, \ldots, i_r)$; the number of such sets is $n^{[r]}$. By the symbols

$$\sum_{\{i_1,\ldots,i_r\}}, \qquad \sum_{(i_1,\ldots,i_r)}$$

we will denote the summations over all distinct unordered and ordered sets of $r$ distinct indices, respectively. It is clear that the summation over all unordered sets $\{i_1, \ldots, i_r\}$ is well suited to summands $f_{i_1,\ldots,i_r}$ whose values are invariant with respect to the permutations of indices. For such summands,

$$\sum_{\{i_1,\ldots,i_r\}} f_{i_1,\ldots,i_r} = \frac{1}{r!}\sum_{(i_1,\ldots,i_r)} f_{i_1,\ldots,i_r}, \qquad (2.4.9)$$


FO {O_O = Die hice (2.4.10) 


(i i) (i@ i) (A ss Jrk) 


sereedp  foce\ly seers 


if the left-hand side summation is taken over all distinct ordered sets of distinct 
r-dimensional indices i(”, ) ieee i®, 


Lemma 2.4.2. If $0 < \varepsilon \le a_i \le E < \infty$, $i = 1, \ldots, n$, then for any fixed $r$, as $n \to \infty$,

$$\Big(\sum_{i=1}^{n} a_i^2\Big)^r = \sum_{(i_1,\ldots,i_r)} a_{i_1}^2\cdots a_{i_r}^2\Big(1 + O\Big(\frac{1}{n}\Big)\Big). \qquad (2.4.11)$$

Proof. The following representation is valid:

$$\Big(\sum_{i=1}^{n} a_i^2\Big)^r = \sum_{(i_1,\ldots,i_r)} a_{i_1}^2\cdots a_{i_r}^2 + \sum\nolimits^{*} a_{i_1}^2\cdots a_{i_r}^2,$$

where the summation in the first sum is taken over all distinct ordered sets of distinct indices, and in the asterisked sum, over all distinct ordered sets each having at least two identical indices. The number of summands in the first sum is $n^{[r]}$; the number of summands in the second sum is equal to $n^r - n^{[r]}$ and does not exceed $c_r n^{r-1}$, where the constant $c_r$ depends only on $r$. Therefore

$$\sum_{(i_1,\ldots,i_r)} a_{i_1}^2\cdots a_{i_r}^2 \ge n^{[r]}\varepsilon^{2r}, \qquad \sum\nolimits^{*} a_{i_1}^2\cdots a_{i_r}^2 \le c_r n^{r-1}E^{2r},$$

and the proof is complete. □
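The content of Lemma 2.4.2 is that dropping the terms with repeated indices changes the $r$th power of $\sum a_i^2$ only by a factor $1 + O(1/n)$. The brute-force sketch below compares the two sides for small $n$ and $r$. The function names and the particular weights, which cycle through 0.5, 1.0, 1.5, 2.0, are chosen here purely for illustration.

```python
from itertools import permutations
from math import prod


def ordered_distinct_sum(a: list, r: int) -> float:
    """Sum of a_{i1}^2 ... a_{ir}^2 over ordered r-tuples of distinct indices."""
    return sum(prod(a[i] ** 2 for i in idx)
               for idx in permutations(range(len(a)), r))


def power_of_sum(a: list, r: int) -> float:
    """(a_1^2 + ... + a_n^2)^r, which also counts tuples with repeated indices."""
    return sum(x * x for x in a) ** r


def ratio(n: int, r: int) -> float:
    """Ratio of the two sides of (2.4.11) for bounded illustrative weights."""
    a = [0.5 + 0.5 * (i % 4) for i in range(n)]
    return power_of_sum(a, r) / ordered_distinct_sum(a, r)


if __name__ == "__main__":
    for n in (20, 40, 80):
        print(n, ratio(n, 2))
```

Doubling $n$ should roughly halve the excess of the ratio over 1, in agreement with the $O(1/n)$ term.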


Corollary 2.4.2. Under the conditions of Theorem 2.4.2, for any fixed $r \ge 3$,

$$E\alpha_r \to \frac{\lambda^r a^{2r}}{2r}.$$

Moreover, for any $\delta > 0$, there exists a constant $c$ such that

$$E\alpha_r \le c\,\frac{(\lambda + \delta)^r(a^2 + \delta)^r}{2r}.$$


Proof. Using representations (2.4.1) and (2.4.2), with the aid of (2.4.9), Corol- 
lary 2.4.1, and Lemma 2.4.2, we obtain 


eee 1 
Ea, = p> qj az (1+0(-)) 


solr} 


(1a Ss 1 
= ent NEON 
r,2r 


eke _ Ma 


The second assertion follows immediately from the inequality of Corollary 2.4.1. 
| 


(1 + 0(1)). 


We now evaluate the factorial moments of $\alpha_r$. If $S_n = \xi_1 + \cdots + \xi_n$, where $\xi_1, \ldots, \xi_n$ take the values 0 and 1 only, then according to Theorem 1.1.4,

$$S_n(S_n - 1)\cdots(S_n - m + 1) = \sum_{(i_1,\ldots,i_m)}\xi_{i_1}\cdots\xi_{i_m}, \qquad (2.4.12)$$

where the summation is taken over all distinct ordered sets of m distinct indices.
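Identity (2.4.12) is purely combinatorial: if exactly $S$ of the indicators equal 1, then the number of ordered $m$-tuples of distinct indices on which all of them equal 1 is $S(S-1)\cdots(S-m+1)$. The sketch below verifies this exhaustively for short 0/1 vectors; the function names are hypothetical.

```python
from itertools import permutations, product
from math import prod


def falling_factorial(s: int, m: int) -> int:
    """s (s - 1) ... (s - m + 1)."""
    return prod(s - j for j in range(m))


def ordered_indicator_sum(xi: tuple, m: int) -> int:
    """Sum of xi_{i1} ... xi_{im} over ordered m-tuples of distinct indices."""
    return sum(prod(xi[i] for i in idx)
               for idx in permutations(range(len(xi)), m))


def check_identity(length: int, m: int) -> bool:
    """Verify (2.4.12) for every 0/1 vector of the given length."""
    return all(falling_factorial(sum(xi), m) == ordered_indicator_sum(xi, m)
               for xi in product((0, 1), repeat=length))


if __name__ == "__main__":
    print(check_identity(5, 2), check_identity(5, 3))
```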
In our case, the indices have a composite structure because

$$\alpha_r = \sum_{\{i_1,\ldots,i_r\}}\ \sum_{j=1}^{(r-1)!/2}\xi^{(j)}_{i_1,\ldots,i_r}.$$

The following representation is analogous to (2.4.12):

$$\alpha_r(\alpha_r - 1)\cdots(\alpha_r - m + 1) = \sum \xi^{(j_1)}_{i_1^{(1)},\ldots,i_r^{(1)}}\cdots\xi^{(j_m)}_{i_1^{(m)},\ldots,i_r^{(m)}}, \qquad (2.4.13)$$

where the summation is taken over all distinct ordered sets

$$\big((\{i_1^{(1)}, \ldots, i_r^{(1)}\}, j_1), \ldots, (\{i_1^{(m)}, \ldots, i_r^{(m)}\}, j_m)\big)$$

of distinct indices of the form $(\{i_1, \ldots, i_r\}, j)$; the set $\{i_1, \ldots, i_r\}$ in the index is considered an unordered set of distinct indices, and $j$ indicates the number of the cycle formed by the vertices $i_1, \ldots, i_r$.

We show that under the conditions of Theorem 2.4.2, for any fixed $r$ and any fixed $m \ge 1$,

$$E\alpha_r^{[m]} \to \Big(\frac{\lambda^r a^{2r}}{2r}\Big)^m. \qquad (2.4.14)$$

This assertion for $m = 1$ follows from Corollary 2.4.2.
In order to become accustomed to the more complicated notation, we first 
consider the case m = 2. By (2.4.13), 


Eq!?] = > PLE ADs fa i) ae i}. 
(({ il? Ja), (22, Ja) ae 


Decompose the right-hand side sum into two sums. Let the first sum 2% include 
the summands with nonintersecting sets ae Ste i} and {i ~, Sd i), When 
we take into account that in this case 2r edges snitst exist to sara 

Ci) G2) 
50), Oe =. 40 = 1, 


and by using Lemma 2.4.1, we obtain 


PLE o 80.0 =] 
AV" 4 ee 
=(5) ae ae ip (1+0(5)). 
Therefore 


AY (ir — DPV? 
sl (ay 
aera Po grerr cf 


x (1+ O(/n)). 




It is clear by virtue of (2.4.9) and (2.4.10) that 


r 


2 2 2 2 1 2 2 
a ---@Q a --@ =—_ —— ay ---at. 
i 0450) i? = Gye : y it iy 
7] 


Therefore, by virtue of Lemma 2.4.2, 
2r 2 
CHIN 71S rar 
21 =( 5 rp? dea} +00) =(=—) A+o(). 
‘ i=] 


(2.4.15) 


We now show that the remaining sum $\Sigma_2$ tends to zero.

The summation in $\Sigma_2$ is taken over the pairs of composite indices in which the sets $\{i_1^{(1)}, \ldots, i_r^{(1)}\}$ and $\{i_1^{(2)}, \ldots, i_r^{(2)}\}$ have at least one common element. Each composite index $(\{i_1, \ldots, i_r\}, j)$ corresponds to a cycle in the complete graph with $n$ vertices; the cycle consists of $r$ edges and the vertices $i_1, \ldots, i_r$. Two cycles corresponding to the indices $(\{i_1^{(1)}, \ldots, i_r^{(1)}\}, j_1)$ and $(\{i_1^{(2)}, \ldots, i_r^{(2)}\}, j_2)$ can have $M < 2r$ distinct vertices and $L$ distinct edges. We decompose the sum $\Sigma_2$ into the sums $\Sigma_{M,L}$ containing summands with fixed values of the parameters $M$ and $L$. The number of such sums does not exceed $(2r)^2$; therefore it is sufficient to prove that any sum $\Sigma_{M,L}$ tends to zero. It is easy to see that in the case $M < 2r$, the inequality $L \ge M + 1$ is valid. The number of summands in the sum $\Sigma_{M,L}$ does not exceed $n^M$, and the probability that $L$ fixed edges appear in $G_{n,T}$ does not exceed, by virtue of (2.4.4), the value $cn^{-L}$. This implies

$$\Sigma_{M,L} \le \frac{c}{n}. \qquad (2.4.16)$$

Therefore, as $n \to \infty$,

$$\Sigma_2 \to 0. \qquad (2.4.17)$$

The assertion (2.4.14) for $m = 2$ follows from (2.4.15) and (2.4.17).
Now let us consider the factorial moment of an arbitrary order m. By (2.4.13), 


Eo!) — 2) + 22, 


where the sum % includes only summands that do not have a pair of sets from 
fi,” Pe a) eee i”, ..., i} with common elements. In this case, rm edges 
must occur in the graph G, 7 to guarantee that the corresponding random variables 
equa! 1. From this and Lemma 2.4.1, it follows that 


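\[ P\{\xi^{(j_1)}_{i_1^{(1)},\dots,i_r^{(1)}} = 1, \dots, \xi^{(j_m)}_{i_1^{(m)},\dots,i_r^{(m)}} = 1\}
 = \Big(\frac{\lambda}{n}\Big)^{mr} a^2_{i_1^{(1)}} \cdots a^2_{i_r^{(1)}} \cdots a^2_{i_1^{(m)}} \cdots a^2_{i_r^{(m)}} (1 + o(1)), \]

and, by (2.4.9) and Lemma 2.4.2,

\[ \Sigma_1 = \Big( \frac{\lambda^r a^{2r}}{2r} \Big)^m (1 + o(1)). \tag{2.4.18} \]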


It remains to prove that the sum $\Sigma_2$ taken over the remaining sets of indices
tends to zero. The summation in $\Sigma_2$ is taken over $m$ sets of composite indices that
have at least one common element in at least one pair of the sets $\{i_1^{(p)},\dots,i_r^{(p)}\}$,
$\{i_1^{(q)},\dots,i_r^{(q)}\}$, $p \ne q$. Recall that each composite index corresponds to a cycle
in the complete graph with $n$ vertices. The cycles corresponding to $m$ indices can
contain $M$ distinct vertices and $L$ distinct edges. We decompose the sum $\Sigma_2$ into
the sums $\Sigma_{M,L}$ containing summands with fixed values of the parameters $M$ and
$L$. The number of such sums does not exceed $(rm)^2$; therefore it is sufficient to
prove that any sum $\Sigma_{M,L}$ tends to zero. It is clear that if $M < rm$, then $L \ge M + 1$.
Thus, since the number of summands in the sum $\Sigma_{M,L}$ does not exceed $n^M$ and by
(2.4.3) the probability of $L$ fixed edges occurring in $G_{n,T}$ does not exceed $cn^{-L}$,

\[ \Sigma_{M,L} \le \frac{c}{n^{L-M}} \le \frac{c}{n}. \]

Therefore, as $n \to \infty$,

\[ \Sigma_2 \to 0. \tag{2.4.19} \]


The assertion (2.4.14) follows from (2.4.18) and (2.4.19).

By (2.4.14), the limit distribution for $\alpha_r$, $r \ge 3$, is the Poisson distribution
with parameter $\lambda_r = \lambda^r a^{2r}/(2r)$. It is easy to see that in the current situation the
number of loops $\alpha_1$ and the number of pairs of parallel edges $\alpha_2$ approach the
Poisson distributions with parameters $\lambda_1 = \lambda a^2/2$ and $\lambda_2 = \lambda^2 a^4/4$, respectively.

This proves Theorem 2.4.2.

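The more general Theorem 2.4.3 can be proved analogously. It is sufficient to
verify that under the conditions of the theorem,

\[ E\alpha_{r_1}^{[m_1]} \cdots \alpha_{r_s}^{[m_s]} \to \lambda_{r_1}^{m_1} \cdots \lambda_{r_s}^{m_s} \]

for arbitrary fixed integers $m_1, \dots, m_s$, where

\[ \lambda_r = \frac{\lambda^r a^{2r}}{2r}. \]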
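By (2.4.13),

\[ \alpha_{r_k}^{[m_k]} = \sum \xi^{(j_1^{(k)})}_{I_1^{(k)}} \cdots \xi^{(j_{m_k}^{(k)})}_{I_{m_k}^{(k)}}, \]

where

\[ I_l^{(k)} = \{ i_1^{(l,k)}, \dots, i_{r_k}^{(l,k)} \}, \qquad l = 1, \dots, m_k, \quad k = 1, \dots, s, \]

are unordered sets of $r_k$ vertices, and $j_l^{(k)}$, $l = 1, \dots, m_k$, $k = 1, \dots, s$, are the
numbers of cycles of length $r_k$ under the labeling chosen.
Therefore

\[ E\alpha_{r_1}^{[m_1]} \cdots \alpha_{r_s}^{[m_s]}
 = \sum_{I} P\{ \xi^{(j_1^{(1)})}_{I_1^{(1)}} = 1, \dots, \xi^{(j_{m_s}^{(s)})}_{I_{m_s}^{(s)}} = 1 \}, \]

where

\[ I = \big( (I_1^{(1)}, j_1^{(1)}), \dots, (I_{m_1}^{(1)}, j_{m_1}^{(1)}), \dots, (I_1^{(s)}, j_1^{(s)}), \dots, (I_{m_s}^{(s)}, j_{m_s}^{(s)}) \big). \]

We decompose the sum on the right-hand side of this representation into two
parts; let the sum $\Sigma_1$ include only summands with the distinct elements in all
$I_l^{(k)}$, $l = 1, \dots, m_k$, $k = 1, \dots, s$; and let the sum $\Sigma_2$ include all the remaining
summands. For the summands of the first sum, the corresponding random variables
equal 1 only if there exist $m_1 r_1 + \cdots + m_s r_s$ fixed edges in $G_{n,T}$. Therefore, by
Lemma 2.4.1,

\[ P\{ \xi^{(j_1^{(1)})}_{I_1^{(1)}} = 1, \dots, \xi^{(j_{m_s}^{(s)})}_{I_{m_s}^{(s)}} = 1 \}
 = \Big(\frac{\lambda}{n}\Big)^{m_1 r_1 + \cdots + m_s r_s} \prod_{k=1}^{s} \prod_{l=1}^{m_k} a^2_{i_1^{(l,k)}} \cdots a^2_{i_{r_k}^{(l,k)}} \, (1 + o(1)), \]

and, by (2.4.9), (2.4.10), and Lemma 2.4.2,

\[ \Sigma_1 = \Big( \frac{\lambda^{r_1} a^{2r_1}}{2r_1} \Big)^{m_1} \cdots \Big( \frac{\lambda^{r_s} a^{2r_s}}{2r_s} \Big)^{m_s} (1 + o(1)). \]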


It remains to prove that $\Sigma_2$ tends to zero. The summation in $\Sigma_2$ is taken over sets of
composite indices in which at least one of the elements $1, 2, \dots, n$ is encountered
at least twice. A cycle corresponds to each of the composite indices. The existence
of a common element in the cycles implies that the number $M$ of distinct vertices
contained in the cycles and the number $L$ of distinct edges involved in the cycles
satisfy $L \ge M + 1$. We decompose the sum $\Sigma_2$ into a finite number of sums $\Sigma_{M,L}$
containing summands with fixed values of the parameters $M$ and $L$. By virtue
of (2.4.3), for each of these sums, the estimate

\[ \Sigma_{M,L} \le \frac{c}{n^{L-M}} \le \frac{c}{n} \]

holds because the number of summands does not exceed $n^M$, and the probability of
$L$ fixed edges occurring in $G_{n,T}$ does not exceed $cn^{-L}$. This proves Theorem 2.4.3.
To prove Theorem 2.4.1, we need the following auxiliary assertion.

Lemma 2.4.3. Let $\xi_1^{(n)}, \xi_2^{(n)}, \dots$ be nonnegative integer-valued random variables
such that for an arbitrary fixed $s$ and arbitrary nonnegative integers $k_1, \dots, k_s$,

\[ P\{\xi_1^{(n)} = k_1, \dots, \xi_s^{(n)} = k_s\} \to \frac{a_1^{k_1} \cdots a_s^{k_s}}{k_1! \cdots k_s!} \, e^{-a_1 - \cdots - a_s} \]

as $n \to \infty$, where $a_1, a_2, \dots$ is a fixed sequence of nonnegative numbers. Moreover,
suppose

\[ E(\xi_{s+1}^{(n)} + \xi_{s+2}^{(n)} + \cdots) \to 0 \tag{2.4.20} \]

as $s \to \infty$, uniformly in $n$, and let

\[ \sum_{k=1}^{\infty} a_k = A < \infty. \]

Then the distribution of the random variable $\zeta^{(n)} = \xi_1^{(n)} + \xi_2^{(n)} + \cdots$ converges
to the Poisson distribution with parameter $A$.
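Proof. We show that for an arbitrary fixed $\varepsilon > 0$ and an arbitrary fixed $m$,

\[ \Big| P\{\zeta^{(n)} = m\} - \frac{A^m e^{-A}}{m!} \Big| < \varepsilon \]

for sufficiently large $n$. For fixed $\varepsilon$ and $m$, there exists $s$ such that

\[ \Big| \frac{A_s^m e^{-A_s}}{m!} - \frac{A^m e^{-A}}{m!} \Big| < \frac{\varepsilon}{3}, \]

where $A_s = a_1 + \cdots + a_s$.
It is not hard to see that

\[ |P\{\zeta^{(n)} = m\} - P\{\zeta_s^{(n)} = m\}| \le P\{\xi_{s+1}^{(n)} + \xi_{s+2}^{(n)} + \cdots > 0\}. \]

Therefore, by (2.4.20), $|P\{\zeta^{(n)} = m\} - P\{\zeta_s^{(n)} = m\}| < \varepsilon/3$ for sufficiently large
$s$. Finally, the conditions of the lemma yield the convergence of the distribution of
$\zeta_s^{(n)} = \xi_1^{(n)} + \cdots + \xi_s^{(n)}$ (for any fixed $s$) to the Poisson distribution with parameter
$A_s = a_1 + \cdots + a_s$. Therefore

\[ \Big| P\{\zeta_s^{(n)} = m\} - \frac{A_s^m e^{-A_s}}{m!} \Big| < \frac{\varepsilon}{3} \]

for sufficiently large $n$. □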


Theorem 2.4.1 follows from Theorem 2.4.3 and Lemma 2.4.3, whose conditions
are satisfied when $\lambda a^2 < 1$.


2.5. Notes and references


The investigation of the evolution of random graphs began when P. Erdős and
A. Rényi published the results of their study [37] in 1960. Along with the basic
properties of the random graph $G_{n,T}$, they discovered the effect known as a phase
transition. At about the same time, V. E. Stepanov studied the graph $G_{n,p}$, as
documented later [133, 134, 135]. Until recently, Stepanov’s results had not seemed
to receive wide recognition. In particular, Stepanov proved that if $p = c/n$, where
$c$ is a constant, $c > 1$, then the size of the giant component is asymptotically
normal with mean $n\alpha(c)$ and variance $n\beta(c)$, where

\[ \alpha(c) = 1 - \frac{\gamma}{c}, \qquad \beta(c) = \frac{\gamma (1 - \gamma/c)}{c (1 - \gamma)^2}, \]

and $\gamma < 1$ is the root of the equation

\[ \gamma e^{-\gamma} = c e^{-c}. \]

A similar assertion for the graph $G_{n,T}$ was proved by B. Pittel [123] about twenty
years later. He found that the size of the giant component of $G_{n,T}$ is asymptotically
normal with parameters $n\alpha(c)$ and $n\beta(c)(1 - 2\gamma + 2\gamma^2/c)$ as $n, T \to \infty$ and
$2T/n \to c > 1$.
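As a numerical illustration of these formulas, the following sketch solves $\gamma e^{-\gamma} = c e^{-c}$ by bisection and then evaluates $\alpha(c)$ and $\beta(c)$. The function names and the bisection tolerance are choices made for this illustration and are not part of the cited results.

```python
from math import exp

def gamma_root(c, tol=1e-14):
    """Root gamma in (0, 1) of gamma*exp(-gamma) = c*exp(-c), assuming c > 1."""
    target = c * exp(-c)
    lo, hi = 0.0, 1.0  # t*exp(-t) increases on [0, 1] and exceeds target at t = 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def alpha(c):
    """Limiting fraction of vertices in the giant component."""
    return 1 - gamma_root(c) / c

def beta(c):
    """Variance constant of the normal limit for G_{n,p} with p = c/n."""
    g = gamma_root(c)
    return g * (1 - g / c) / (c * (1 - g) ** 2)
```

For $c = 2$ this gives $\gamma \approx 0.406$ and $\alpha(2) \approx 0.797$. A convenient sanity check is the identity $1 - \alpha(c) = e^{-c\alpha(c)}$, which follows directly from the equation for $\gamma$.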

Many open questions concerning the evolution of random graphs remain. The
main goal of this chapter is to demonstrate the approach based on the generalized
scheme of allocation in investigations of the evolution of random graphs.
Section 2.1 shows that fine properties of subcritical graphs can be obtained in a rather
simple and natural way, especially as concerns the behavior of subcritical graphs
near the critical point. The transition phenomena for the graph $G_{n,T}$ were first
considered by B. Bollobás [20]. The results presented in Section 2.1 can be found
in [77]. The approach based on the generalized scheme of allocation allowed us to
prove asymptotic normality of the number of unicyclic components and find the
limit distribution of the maximum sizes of trees and unicyclic components.

Section 2.2 is devoted to critical graphs. The behavior of random graphs near
the critical point, and especially in the critical domain where the giant component
appears, is very complicated and difficult to investigate. The investigations of the
behavior are far from complete, but even now the results obtained could fill another
book. Much information about random graphs can be found in the fundamental
work by Bollobás [21] and in the book [105], which is devoted to the evolution
of random graphs. A detailed investigation of the birth of the giant component
is given in [63]. Supercritical graphs are considered by Łuczak [99], who, in
particular, proved that the right-hand bound of the critical domain is determined
by the conditions $n, T \to \infty$, $(1 - 2T/n)^3 n \to -\infty$.

Formally, to analyze supercritical random graphs, we can use the representation
of almost all such graphs as a combination of components of three types: one giant
component, trees, and unicyclic components. However, this approach is hampered
by the absence of a simple formula for the number of connected graphs with $n$
vertices and $T$ edges with $k = T - n > 0$. Note that $k = T - n$ is equal to the
number of independent cycles in the graph and is called the cyclomatic number
of the graph. Denote by $c(n, k)$ the number of connected graphs with $n$ labeled
vertices and a cyclomatic number $k$. It is clear that $c(n, -1)$ is the number of trees,
and by the Cayley formula, $c(n, -1) = n^{n-2}$, whereas $c(n, 0)$ is the number $u_n$ of
unicyclic graphs considered in Section 1.7. The numbers $c(n, k)$ were investigated
by Stepanov (see [10, 142, 143]) and E. M. Wright [151, 152] and are known as
the Stepanov-Wright numbers (see [143]). As $n \to \infty$ and $k^3/n \to 0$,

\[ c(n, k) = d (3\pi)^{1/2} \Big( \frac{e}{12k} \Big)^{k/2} n^{n + (3k-1)/2} (1 + o(1)), \]

where, as it was proved by Meertens, $d = 1/(2\pi)$ (see Bender, Canfield, and
McKay [16]).

We hope that the results of the study by Bender et al. [17], who give the
asymptotics of $c(n, k)$ for all regular variations of the parameters $n$ and $k$, can be used in
the application of the generalized scheme to random graphs and help to bring the
investigations of supercritical graphs to the level attained for the subcritical case
in Section 2.1. Note that obtaining the limit distributions of numerical
characteristics of supercritical graphs would be merely a problem of averaging if the joint
distribution of the size of the giant component and the number of its edges were
known.

The parameter $\theta = 2T/n$ plays the role of time in the evolution of random
graphs. Therefore, each numerical characteristic of a random graph can be
considered not only as a random variable, but also as a random process with the time
parameter $\theta$. Of significant interest is the approach using the convergence of such
processes. This approach is used in the recent papers [34, 62, 127]. Note that the
investigations of convergence of such random processes in combinatorial problems
were started by B. A. Sevastyanov [132] and Yu. V. Bolotnikov [22, 23, 24].

The random graph $G_{n,T}$ discussed in Section 2.3 was investigated by Kolchin
[79, 83]. This graph provides an appropriate model of the graph corresponding
to the left-hand side of a system of random congruences modulo 2 considered in
the next chapter. An analog of Theorem 2.3.8 for bipartite graphs was proved by
Saltykov [131].

The nonequiprobable version of the graph $G_{n,T}$ is considered in Section 2.4,
where the results of the papers [88, 66, 65] are presented. Here we use the method of
moments. The lack of regular methods for an asymptotic analysis of nonequiprobable
graphs makes it impossible to carry out anything approaching a complete
investigation of such graphs. It seems to us that developing the methods appropriate
for the analysis of nonequiprobable combinatorial structures is a problem of
great importance.


3 


Systems of random linear equations 
in GF(2) 


3.1. Rank of a matrix and critical sets 


In this section, we consider systems of linear equations in GF(2), the field with
elements 0 and 1. Let us begin with two examples where such systems appear.

Consider first a simple classification problem. Suppose we have a set of $n$ objects
of two sorts, for example, of two different weights. We may sequentially sample
pairs of the objects from the set at random, compare the weights of the objects
from the chosen pair, and determine whether the weights are identical or different.
The problem is to identify the objects that have the same weight, or rather to
estimate the probability of finding that solution. For a formal description of the
situation, let $\{1, 2, \dots, n\}$ be the set of objects under consideration and let $x_j$ be
the unknown type of the object $j$, $j = 1, \dots, n$. We may assume that $x_1, \dots, x_n$
take the values 0 and 1, depending on the class to which the object belongs. We
choose a pair of objects $i(t)$ and $j(t)$ in the trial with number $t$, $t = 1, \dots, T$,
and let $b_t$ be the result of their comparison: $b_t = 0$ if their weights are identical,
and $b_t = 1$ otherwise. Thus, the results of the comparisons can be written as the
following system of linear equations in GF(2):

\[ x_{i(t)} + x_{j(t)} = b_t, \qquad t = 1, \dots, T. \tag{3.1.1} \]


It is clear that the system can be rewritten in the matrix form

\[ AX = B, \]

where $X = (x_1, \dots, x_n)$ and $B = (b_1, \dots, b_T)$ are column-vectors, and the
elements $a_{tj}$ of the matrix $A = \|a_{tj}\|$, $t = 1, \dots, T$, $j = 1, \dots, n$, are random
variables whose distribution is determined by the sampling procedure. It is
convenient to associate the system, or more precisely, the matrix $A$, with the random
graph $G_{n,T}$ with $n$ vertices that correspond to the variables $x_1, \dots, x_n$. The graph
has $T$ edges $(i(t), j(t))$, $t = 1, \dots, T$. Therefore the graph can have loops and
multiple edges, depending on the sampling procedure.

In this chapter, we consider the characteristics of the graph $G_{n,T}$ that are related
to some of the properties of the system (3.1.1). It is clear that the connectedness of
the graph is an important characteristic for the classification problem. Indeed, in
the case where the graph is connected, we can determine all values of the variables
$x_1, \dots, x_n$ if we set one of them equal to 0 or 1. In both cases, the partitions of the
set are the same, but the system has two different solutions. In the case where the
graph $G_{n,T}$ is disconnected, the system has more than two solutions; therefore a
complete classification is impossible.
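The role of connectedness can be made concrete with a small sketch. It processes the comparisons of system (3.1.1) with a union-find structure that also stores parities, reports whether the system is consistent, and counts the connected components of the graph; a consistent system then has 2 to the power of the number of components solutions. The function name `classify`, the triple format `(i, j, b)`, and the 0-based numbering of objects are assumptions made for this illustration.

```python
def classify(n, comparisons):
    """Process equations x_i + x_j = b (mod 2), given as triples (i, j, b).

    Objects are numbered 0..n-1 here (the text numbers them 1..n).
    Returns (labels, components). labels[v] is x_v relative to the root of
    the component of v, or labels is None if the system is inconsistent.
    """
    parent = list(range(n))
    parity = [0] * n  # parity[v] = x_v + x_parent[v] (mod 2)

    def find(v):
        # Return (root of v, x_v + x_root mod 2) and compress the path.
        path = []
        while parent[v] != v:
            path.append(v)
            v = parent[v]
        root, acc = v, 0
        for u in reversed(path):  # start with the node closest to the root
            acc ^= parity[u]
            parent[u], parity[u] = root, acc
        return root, acc

    components = n
    for i, j, b in comparisons:
        ri, pi = find(i)
        rj, pj = find(j)
        if ri == rj:
            if pi ^ pj != b:
                return None, components  # contradictory comparison
        else:
            parent[ri] = rj
            parity[ri] = pi ^ pj ^ b
            components -= 1
    labels = [find(v)[1] for v in range(n)]
    return labels, components
```

When `components` equals 1 the two solutions are `labels` and its complement, which define the same partition of the objects, in line with the remark above.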

Now let the vector $B$ consist of independent random variables that take the
values 0 and 1. If the balance is out of order, the weighings can sometimes be
wrong, and the variables $b_1, \dots, b_T$ can differ from the true values. In this case,
we obtain a system with distorted entries on the right-hand side that sometimes has
no solution. If the balance is completely wrong, we may assume that the variables
$b_1, \dots, b_T$ do not depend on the left-hand side of the system and take the values
0 and 1 with equal probabilities. In this situation, several natural problems arise.
Does the right-hand side $b_1, \dots, b_T$ depend on the left-hand side of the system or
are the sides independent? Can we reconstruct the real values of $x_1, \dots, x_n$ in the
case where the right-hand parts $b_1, \dots, b_T$ are distorted?

Let us turn to the second example. Let a vector $(c_1, \dots, c_n)$ in GF(2) be given.
If we take an initial vector $x_1, \dots, x_n$, then we can develop the recurring sequence
$x_{n+t}$, $t = 1, 2, \dots$, by the following recurrence relation:

\[ x_{n+t} = c_1 x_t + \cdots + c_n x_{n+t-1}, \qquad t = 1, 2, \dots. \tag{3.1.2} \]

This recurrence relation can be realized with the help of a device called a shift
register, presented in Figure 3.1.1. A shift register consists of $n$ cells or stages with
labels $1, 2, \dots, n$. The $n$-dimensional (0, 1) vector of the contents of these stages
is called the state of the shift register. At an initial moment, the state of the shift
register under consideration is the vector $(x_1, \dots, x_n)$. The choice of the vector
$(c_1, \dots, c_n)$ means that we choose the stages with numbers corresponding to the
ones in the sequence $c_1, \dots, c_n$ and form the mod 2 sum $x_{n+1} = c_1 x_1 + \cdots + c_n x_n$.
At the next moment, the contents of all stages are shifted to the left so that $x_n$
transfers to the stage numbered $n - 1$, $x_{n-1}$ transfers to the stage $n - 2$, and so
on, $x_1$ leaves the register, and the sum $x_{n+1} = c_1 x_1 + \cdots + c_n x_n$ is placed into the
stage with label $n$. Thus the state $(x_1, \dots, x_n)$ transfers to the state $(x_2, \dots, x_{n+1})$.

Figure 3.1.1. Shift register

The process is repeated. Thus, if $c_1, \dots, c_n$ are given, then for any initial state
$x_1, \dots, x_n$, the recurring sequence (3.1.2) satisfies

\[
\begin{aligned}
x_{n+1} &= c_1 x_1 + \cdots + c_n x_n, \\
x_{n+2} &= c_1 x_2 + \cdots + c_n x_{n+1}, \\
&\;\;\vdots \\
x_{n+T} &= c_1 x_T + \cdots + c_n x_{n+T-1}.
\end{aligned}
\]


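Let us change the notations and put $b_t = x_{n+t}$, $t = 1, 2, \dots, T$, and $a_{11} = c_1, \dots, a_{1n} = c_n$.
Then the first relation becomes

\[ a_{11} x_1 + \cdots + a_{1n} x_n = b_1. \]

It is clear that we can substitute $c_1 x_1 + \cdots + c_n x_n$ for $x_{n+1}$ in the second relation
and obtain

\[ a_{21} x_1 + \cdots + a_{2n} x_n = b_2. \]

In the same way, we obtain

\[
\begin{aligned}
a_{11} x_1 + \cdots + a_{1n} x_n &= b_1, \\
&\;\;\vdots \\
a_{T1} x_1 + \cdots + a_{Tn} x_n &= b_T.
\end{aligned}
\tag{3.1.3}
\]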
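The passage from the recurrence (3.1.2) to the linear system (3.1.3) can be sketched in code. The first function below runs the shift register directly; the second builds the rows $a_t$ by tracking each $x_{n+t}$ as a GF(2)-linear form in the initial state, which is the substitution described above. The function names are hypothetical, and lists are indexed from 0 while the text indexes from 1.

```python
def lfsr_sequence(c, x, T):
    """Return [b_1, ..., b_T] with b_t = x_{n+t} computed by recurrence (3.1.2)."""
    state = list(x)
    out = []
    for _ in range(T):
        new = sum(ci & xi for ci, xi in zip(c, state)) % 2
        out.append(new)
        state = state[1:] + [new]  # shift left, feed the new bit into stage n
    return out

def lfsr_matrix(c, T):
    """Rows a_1, ..., a_T of system (3.1.3): b_t = a_t1 x_1 + ... + a_tn x_n."""
    n = len(c)
    # forms[k] is the coefficient vector of x_{k+1} as a linear form in x_1..x_n
    forms = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    rows = []
    for t in range(T):
        window = forms[t:t + n]  # forms of x_{t+1}, ..., x_{t+n}
        new = [0] * n
        for ci, form in zip(c, window):
            if ci:
                new = [u ^ v for u, v in zip(new, form)]
        forms.append(new)
        rows.append(new)
    return rows
```

By construction the first row equals $(c_1, \dots, c_n)$, and multiplying each row by the initial state modulo 2 reproduces the output of `lfsr_sequence`.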


Suppose that the initial state $(x_1, \dots, x_n)$ is unknown and we observe the
sequence $b_1, \dots, b_T$. Then we can regard relations (3.1.3) as a system of linear
equations with respect to the unknowns $x_1, \dots, x_n$. A natural question is how many
observations are needed to reconstruct the initial state and to obtain all elements
of the sequence $b_t$, $t = T + 1, \dots$.

The other situation concerns the feedback points $c_1, \dots, c_n$. Suppose we
observe the sequence $b_1, \dots, b_T$, but the vector $(c_1, \dots, c_n)$ determining the shift
register is unknown. If the number of 1’s in $(c_1, \dots, c_n)$ is $k$, then there are $\binom{n}{k}$
possibilities for this vector. If we use an exhaustive search to find the true vector
that corresponds to the observed sequence, we have the following situation. If the
chosen vector is true, then system (3.1.3) is consistent for any $T$, but if the vector
$(c_1, \dots, c_n)$ is wrong, then the system becomes inconsistent for some $T$.
Therefore the consistency of the system (3.1.3) serves as a test for selecting the true
vector.

Let us introduce the auxiliary notions of a critical set and a hypercycle for
our investigations of systems of linear equations in GF(2). Note that the ordinary
notions of linear algebra, such as the notion of linear independence of vectors, rank
of a matrix, Cramer’s rule for finding the solutions of linear systems of equations,
and so on, are extended in the obvious way to the $n$-dimensional vector space over
GF(2). For example, if the rank of a $T \times n$ matrix $A = \|a_{tj}\|$ in GF(2) is $r$, then
the homogeneous system of equations

\[ AX = 0, \]

where $X = (x_1, \dots, x_n)$ is the column-vector of unknowns, has exactly $n - r$
linearly independent solutions.
Denote by

\[ a_t = (a_{t1}, \dots, a_{tn}), \qquad t = 1, \dots, T, \]

the rows of the matrix $A$. If the coordinate-wise sum

\[ a_{t_1} + \cdots + a_{t_m} = 0, \]

then the set $C = \{t_1, \dots, t_m\}$ of row indices is called a critical set.
If $C_1$ and $C_2$ are critical sets and $C_1 \ne C_2$, then

\[ C_1 \,\triangle\, C_2 = (C_1 \cup C_2) \setminus (C_1 \cap C_2) \]

is also a critical set.
Let $\varepsilon_1, \dots, \varepsilon_s$ take the values 0 and 1. Critical sets $C_1, \dots, C_s$ are called
independent if

\[ \varepsilon_1 C_1 \,\triangle\, \varepsilon_2 C_2 \,\triangle \cdots \triangle\, \varepsilon_s C_s = \varnothing \]

if and only if $\varepsilon_1 = \cdots = \varepsilon_s = 0$.
Denote by $s(A)$ the maximum number of independent critical sets and by $r(A)$
the rank of the matrix $A$.


Theorem 3.1.1. For any $T \times n$ matrix $A$ in GF(2),

\[ s(A) + r(A) = T. \]


Proof. We consider the homogeneous system of equations

\[ A'Y = 0 \tag{3.1.4} \]

in GF(2), where $A'$ is the transpose of $A$. There is a one-to-one correspondence
between the solutions of the system (3.1.4) and the critical sets: The solution
$Y_{t_1,\dots,t_m} = (y_1, \dots, y_T)$, whose components $y_{t_1}, \dots, y_{t_m}$ are 1 and the other
components are zero, corresponds to the critical set $C = \{t_1, \dots, t_m\}$. The linear
independence of solutions corresponds to the independence of critical sets.
Therefore the maximum number $s(A)$ of independent critical sets equals the maximum
number of linearly independent solutions of system (3.1.4), which we know is $T - r(A)$. □

In addition to the critical sets of a $T \times n$ matrix $A = \|a_{tj}\|$, we consider a
hypergraph $G_A$ that is also defined by the matrix $A$. The set of vertices of the
hypergraph $G_A$ is the set $\{1, \dots, n\}$ of column indices and the set of enumerated
hyperedges is the set $\{e_1, \dots, e_T\}$, where

\[ e_t = \{ j : a_{tj} = 1 \}, \qquad t = 1, \dots, T. \]

Thus there exists a correspondence between a row $a_t = (a_{t1}, \dots, a_{tn})$ and
the hyperedge $e_t$, $t = 1, \dots, T$. Note that the empty set corresponds to a row
consisting of zeros.

The multiplicity of a vertex $j$ in a set of hyperedges $C = \{e_{t_1}, \dots, e_{t_m}\}$ is the
number of hyperedges in $C$ that contain this vertex.

A set of hyperedges $C = \{e_{t_1}, \dots, e_{t_m}\}$ is called a hypercycle if each vertex of
the hypergraph $G_A$ has an even multiplicity in $C$, in other words, if the
coordinate-wise sum of rows $a_{t_1} + \cdots + a_{t_m}$ in GF(2) equals the zero vector.

If each row of the matrix $A$ contains exactly two 1’s, then the hypergraph $G_A$
is an ordinary graph, perhaps with multiple edges, and a hypercycle is an ordinary
cycle or a union of cycles.

The set of the indices of hyperedges that form a hypercycle is a critical set for
the matrix $A$. Let $\varepsilon_1, \dots, \varepsilon_s$ take the values 0 and 1. Hypercycles $C_1, \dots, C_s$ are
independent if

\[ \varepsilon_1 C_1 \,\triangle\, \varepsilon_2 C_2 \,\triangle \cdots \triangle\, \varepsilon_s C_s = \varnothing \]

if and only if $\varepsilon_1 = \cdots = \varepsilon_s = 0$. Therefore the maximum number $s(A)$ of independent
critical sets of the matrix $A$ equals the maximum number of independent hypercycles
in $G_A$.
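Theorem 3.1.1 can be checked by brute force on a small matrix. The critical sets together with the empty set are closed under symmetric difference, so a matrix with $s(A)$ independent critical sets has exactly $2^{s(A)} - 1$ nonempty critical sets. The sketch below computes the rank over GF(2) and enumerates the critical sets directly; the function names and the example matrix are illustrative assumptions, and row indices start at 0.

```python
from itertools import combinations

def row_to_int(row):
    """Pack a 0/1 row into an integer so that addition in GF(2) becomes XOR."""
    value = 0
    for bit in row:
        value = (value << 1) | bit
    return value

def rank_gf2(rows):
    """Rank over GF(2), by elimination on the leading bit of each row."""
    basis = {}  # leading-bit position -> reduced row
    for row in rows:
        v = row_to_int(row)
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]
    return len(basis)

def critical_sets(rows):
    """All nonempty sets of row indices whose rows sum to the zero vector."""
    ints = [row_to_int(r) for r in rows]
    found = []
    for size in range(1, len(rows) + 1):
        for idx in combinations(range(len(rows)), size):
            acc = 0
            for t in idx:
                acc ^= ints[t]
            if acc == 0:
                found.append(set(idx))
    return found

A = [[1, 1, 0], [0, 1, 1], [1, 0, 1], [0, 0, 0]]
T, r = len(A), rank_gf2(A)
# Theorem 3.1.1: s(A) = T - r(A), hence 2**(T - r) - 1 nonempty critical sets.
assert len(critical_sets(A)) == 2 ** (T - r) - 1
```

In this example the first three rows form an ordinary triangle and the fourth row is zero, so the hypercycles are the triangle, the empty hyperedge, and their union.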


3.2. Matrices with independent elements 


This section deals with random matrices with independent elements. Let $A = \|a_{tj}\|$
be a $T \times n$ matrix whose elements are independent random variables taking the
values 0 and 1 with equal probabilities, and let $\rho_n(T)$ be the rank of the matrix $A$
in GF(2). The following theorem is the main result of this section.


Theorem 3.2.1. Let $s \ge 0$ and $m$ be fixed integers, $m + s \ge 0$. If $n \to \infty$ and
$T = n + m$, then

\[ P\{\rho_n(T) = n - s\} \to 2^{-s(m+s)} \prod_{i=s+1}^{\infty} \Big(1 - \frac{1}{2^i}\Big) \prod_{i=1}^{m+s} \Big(1 - \frac{1}{2^i}\Big)^{-1}, \]

where the last product equals 1 for $m + s = 0$.


Proof. The limit theorem will be proved by using an explicit formula for
$P\{\rho_n(T) = n - s\}$. Denote by $\rho_n(t)$ the rank of the submatrix of $A$ which consists
of the first $t$ rows of the matrix $A$. We interpret the parameter $t$ as time and consider
the process of sequential growth of the number of rows. Let $\xi_t = 1$ if the rank
$\rho_n(t - 1)$ increases after joining the $t$th row, and $\xi_t = 0$ if the rank preserves the
previous value. It is clear that

\[ \rho_n(t) = \xi_1 + \cdots + \xi_t. \]

It is not difficult to describe the probabilistic properties of the random variables
$\xi_1, \dots, \xi_T$. The event $\{\xi_t = 1\}$ means that the $t$th row is linearly independent of
the rows with numbers $1, \dots, t - 1$, and the event $\{\xi_t = 0\}$
means that the row with number $t$ is a linear combination of the preceding rows.
If among the preceding $t - 1$ rows there are exactly $k$ linearly independent
$n$-dimensional vectors, then the linear span of these $k$ vectors contains $2^k$ vectors
(all linear combinations of these $k$ vectors). The matrix $A$ is constructed in such
a way that each row can be obtained by sampling with replacement from a box
containing all $2^n$ distinct $n$-dimensional vectors. In other words, any row of the
matrix $A$ is independent of all other rows and is equal to any $n$-dimensional vector
with probability $2^{-n}$. Therefore

\[
\begin{aligned}
P\{\xi_t = 0 \mid \rho_n(t-1) = k\} &= \frac{2^k}{2^n}, \\
P\{\xi_t = 1 \mid \rho_n(t-1) = k\} &= 1 - \frac{2^k}{2^n}.
\end{aligned}
\tag{3.2.1}
\]

Thus the process $\rho_n(t)$ is a Markov chain with stationary transition probabilities
that are given by (3.2.1). To find $P\{\rho_n(T) = n - s\}$, we can sum the probabilities
of all trajectories of the Markov chain that lead from the origin to the point
with coordinates $(n + m, n - s)$, that is, the trajectories such that $\rho_n(0) = 0$,
$\rho_n(n + m) = n - s$. If we represent a trajectory as a “broken line” with intervals
of growth and horizontal intervals, we see that any such broken line has exactly
$n + m - (n - s) = m + s$ horizontal intervals corresponding to $m + s$ zeros among
the values of $\xi_1, \dots, \xi_{n+m}$. The graph of the trajectory with $\xi_{t_1} = 0, \dots, \xi_{t_{m+s}} = 0$
is illustrated in Figure 3.2.1.
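Summing over trajectories is the same as propagating the distribution of the Markov chain one row at a time, which gives a direct way to compute the exact distribution of the rank for small $n$ and $T$. The following sketch does this with exact rational arithmetic; the function name `rank_distribution` is an assumption made for this illustration.

```python
from fractions import Fraction

def rank_distribution(n, T):
    """Exact distribution of rho_n(T) for a uniform random T x n matrix over GF(2).

    Returns a dict mapping each attainable rank k to P{rho_n(T) = k}.
    """
    dist = {0: Fraction(1)}  # rho_n(0) = 0
    for _ in range(T):
        nxt = {}
        for k, p in dist.items():
            stay = Fraction(2 ** k, 2 ** n)  # P{xi_t = 0 | rank k}, from (3.2.1)
            nxt[k] = nxt.get(k, 0) + p * stay
            if k < n:
                nxt[k + 1] = nxt.get(k + 1, 0) + p * (1 - stay)
        dist = nxt
    return dist
```

For example, `rank_distribution(2, 2)` assigns probability 3/8 to rank 2, matching the 6 invertible matrices among the 16 binary $2 \times 2$ matrices.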

Figure 3.2.1. Graph of the trajectory with $\xi_{t_1} = \cdots = \xi_{t_{m+s}} = 0$

By using (3.2.1) and Figure 3.2.1, we can easily write an explicit formula for the
probability of a particular trajectory and for the total probability. The derivation
of this probability is quite simple if $m + s = 0$. Indeed, the only trajectory with
$\rho_n(0) = 0$ and $\rho_n(n + m) = n + m$ has no horizontal intervals, and at each interval
the broken line increases; therefore

\[ P\{\rho_n(n+m) = n - s\} = \Big(1 - \frac{1}{2^n}\Big)\Big(1 - \frac{2}{2^n}\Big) \cdots \Big(1 - \frac{2^{n-s-1}}{2^n}\Big) = \prod_{i=s+1}^{n} \Big(1 - \frac{1}{2^i}\Big), \]

and in the case $m + s = 0$, as $n \to \infty$,

\[ P\{\rho_n(n+m) = n - s\} \to \prod_{i=s+1}^{\infty} \Big(1 - \frac{1}{2^i}\Big). \]

This coincides with the assertion of the theorem for $m + s = 0$ because the last
product equals 1.
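In the general case, for $m + s > 0$,

\[
\begin{aligned}
P\{\rho_n(n+m) = n - s\}
&= \sum_{1 \le t_1 < \cdots < t_{m+s} \le n+m} P\{\xi_1 = 1, \dots, \xi_{t_1 - 1} = 1, \xi_{t_1} = 0, \xi_{t_1 + 1} = 1, \dots\} \\
&= \sum_{1 \le t_1 < \cdots < t_{m+s} \le n+m} \Big(1 - \frac{1}{2^n}\Big) \cdots \Big(1 - \frac{2^{n-s-1}}{2^n}\Big)
   \frac{2^{t_1 - 1 + t_2 - 2 + \cdots + t_{m+s} - m - s}}{2^{n(m+s)}} \\
&= 2^{-n(m+s)} \prod_{k=0}^{n-s-1} \Big(1 - \frac{2^k}{2^n}\Big)
   \sum_{1 \le t_1 < \cdots < t_{m+s} \le n+m} 2^{t_1 - 1 + t_2 - 2 + \cdots + t_{m+s} - m - s}.
\end{aligned}
\]

Taking the factor $2^{(n-s)(m+s)}$ out of the sum yields

\[ P\{\rho_n(n+m) = n - s\} = 2^{-s(m+s)} \prod_{i=s+1}^{n} \Big(1 - \frac{1}{2^i}\Big)
   \sum_{1 \le t_1 < \cdots < t_{m+s} \le n+m} 2^{-(n-s)(m+s) + t_1 - 1 + \cdots + t_{m+s} - m - s}. \]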


As will be seen from the following evaluations, the moments $t_1, \dots, t_{m+s}$ are
concentrated at the end of the trajectory; therefore, in the sum of the formula, it is
convenient to switch to the variables

\[ i_l = -(t_l - l + s - n), \qquad l = 1, \dots, m + s. \]

It follows from $1 \le t_1 < \cdots < t_{m+s} \le n + m$ that

\[ 0 \le t_1 - 1 \le t_2 - 2 \le \cdots \le t_{m+s} - m - s \le n - s, \]

and by subtracting $n - s$ from each term, we obtain

\[ -n + s \le t_1 - 1 - n + s \le \cdots \le t_{m+s} - m - s - n + s \le 0. \]

If we change the sign, we see that the domain $1 \le t_1 < \cdots < t_{m+s} \le n + m$ in
terms of the new variables is

\[ 0 \le i_{m+s} \le \cdots \le i_1 \le n - s. \]

Thus

\[ P\{\rho_n(n+m) = n - s\} = 2^{-s(m+s)} \prod_{i=s+1}^{n} \Big(1 - \frac{1}{2^i}\Big)
   \sum_{0 \le i_{m+s} \le \cdots \le i_1 \le n-s} 2^{-i_1 - \cdots - i_{m+s}}. \tag{3.2.2} \]


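It is easily seen that, as $n \to \infty$,

\[ \prod_{i=s+1}^{n} \Big(1 - \frac{1}{2^i}\Big) \to \prod_{i=s+1}^{\infty} \Big(1 - \frac{1}{2^i}\Big) \tag{3.2.3} \]

and

\[ \sum_{0 \le i_{m+s} \le \cdots \le i_1 \le n-s} 2^{-i_1 - \cdots - i_{m+s}} \to \sum_{0 \le i_{m+s} \le \cdots \le i_1} 2^{-i_1 - \cdots - i_{m+s}}. \tag{3.2.4} \]

To complete the proof it remains to transform the right-hand side of (3.2.4). It
is not difficult to see that

\[
\begin{aligned}
\sum_{0 \le i_r \le \cdots \le i_2 \le i_1} 2^{-i_1 - \cdots - i_r}
&= \sum_{0 \le i_r \le \cdots \le i_2} 2^{-i_2 - \cdots - i_r} \sum_{i_2 \le i_1} 2^{-i_1} \\
&= \Big(1 - \frac{1}{2}\Big)^{-1} \sum_{0 \le i_r \le \cdots \le i_3} 2^{-i_3 - \cdots - i_r} \sum_{i_3 \le i_2} 2^{-2 i_2} \\
&= \Big(1 - \frac{1}{2}\Big)^{-1} \Big(1 - \frac{1}{2^2}\Big)^{-1} \sum_{0 \le i_r \le \cdots \le i_4} 2^{-i_4 - \cdots - i_r} \sum_{i_4 \le i_3} 2^{-3 i_3} \\
&= \cdots = \prod_{i=1}^{r} \Big(1 - \frac{1}{2^i}\Big)^{-1}.
\end{aligned}
\tag{3.2.5}
\]

Passing to the limit in (3.2.2) and taking into account (3.2.3), (3.2.4), and (3.2.5)
provide the assertion of the theorem. □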


Let the elements of a $T \times n$ matrix $A = \|a_{tj}\|$ be independent and take the
values 0 and 1 with equal probabilities. We consider the system of equations

\[ AX = 0 \tag{3.2.6} \]

with respect to unknowns $X = (x_1, \dots, x_n)$ in GF(2). Denote by $\nu_{n,T}$ the number
of linearly independent solutions of this system of equations. If the rank $\rho_n(T)$
of the matrix $A$ equals $r$, then $\nu_{n,T} = n - r$. Therefore Theorem 3.2.1 yields the
following assertion.


Theorem 3.2.2. Let $s \ge 0$ and $m$ be fixed integers, $m + s \ge 0$. If $n \to \infty$, then

\[ P\{\nu_{n,n+m} = s\} \to 2^{-s(m+s)} \prod_{i=s+1}^{\infty} \Big(1 - \frac{1}{2^i}\Big) \prod_{i=1}^{m+s} \Big(1 - \frac{1}{2^i}\Big)^{-1}, \]

where the last product equals 1 for $m + s = 0$.


In particular, for $m = s = 0$,

\[ P\{\nu_{n,n} = 0\} \to \prod_{i=1}^{\infty} \Big(1 - \frac{1}{2^i}\Big) = 0.288788095\ldots. \]

The results of Theorems 3.2.1 and 3.2.2 are of special interest because they are
stable in the sense that the limit distribution of the rank of a matrix is invariant
with respect to deviations of the distributions of its elements from the equiprobable
distribution.
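The limit in Theorems 3.2.1 and 3.2.2 is easy to evaluate numerically. The sketch below truncates the infinite product after a fixed number of factors, which is an approximation; the function name `limit_prob` and the default of 200 factors are assumptions made for this illustration.

```python
def limit_prob(s, m, terms=200):
    """Limit of P{rho_n(n+m) = n-s} as n grows, with the infinite product truncated."""
    if s < 0 or m + s < 0:
        return 0.0
    value = 2.0 ** (-s * (m + s))
    for i in range(s + 1, s + 1 + terms):  # product over i = s+1, s+2, ...
        value *= 1.0 - 2.0 ** (-i)
    for i in range(1, m + s + 1):          # empty product when m + s = 0
        value /= 1.0 - 2.0 ** (-i)
    return value
```

Here `limit_prob(0, 0)` is about 0.2888, the limiting probability that a large square matrix over GF(2) is nonsingular, and for fixed $m$ the values over all admissible $s$ add up to 1, as they should for a limit distribution.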


Theorem 3.2.3. Let the elements of a $T \times n$ matrix $A = \|a_{tj}\|$ be independent
and suppose there is a positive constant $\delta$ such that, for the probabilities $p_{tj}^{(n)} = P\{a_{tj} = 1\}$,
the inequalities

\[ \delta \le p_{tj}^{(n)} \le 1 - \delta, \qquad t = 1, \dots, T, \quad j = 1, \dots, n, \]

hold. Let $s \ge 0$ and $m$ be fixed integers, $m + s \ge 0$. Then, as $n \to \infty$,

\[ P\{\rho_n(n+m) = n - s\} \to 2^{-s(m+s)} \prod_{i=s+1}^{\infty} \Big(1 - \frac{1}{2^i}\Big) \prod_{i=1}^{m+s} \Big(1 - \frac{1}{2^i}\Big)^{-1}, \]

where the last product equals 1 for $m + s = 0$.


Because these results are outside of the main combinatorial direction of this
book, we will omit the complicated proof of this theorem (see, e.g., [93]).

We illustrate the situation by proving that, under the conditions of Theorem
3.2.3, the mean value of the number of nontrivial solutions of system (3.2.6) is
invariant to deviations of the distributions of elements of $A$ from the equiprobable
distribution.

Let $\mu_{n,T}$ be the number of nontrivial (i.e., nonzero) solutions of system (3.2.6).
If we associate to the vector $X$ an indicator that is 1 if $X$ satisfies the system, then

\[ E\mu_{n,T} = \sum_{X \ne 0} P\{AX = 0\}. \]

We will evaluate $E\mu_{n,T}$ by using the following lemma on summation of
independent random variables in GF(2).


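Lemma 3.2.1. Let $\xi_1, \dots, \xi_n$ be independent random variables that take the
values 0 and 1 with probabilities

\[ P\{\xi_i = 1\} = \frac{1 - \Delta_i}{2}, \qquad P\{\xi_i = 0\} = \frac{1 + \Delta_i}{2}, \qquad i = 1, \dots, n. \]

Then, in GF(2),

\[ P\{\xi_1 + \cdots + \xi_n = 1\} = \frac{1 - \Delta_1 \cdots \Delta_n}{2}. \]

Proof. It is clear that it suffices to prove the assertion of the lemma for $n = 2$. In
that case,

\[
\begin{aligned}
P\{\xi_1 + \xi_2 = 1\} &= P\{\xi_1 = 1, \xi_2 = 0\} + P\{\xi_1 = 0, \xi_2 = 1\} \\
&= \frac{(1 - \Delta_1)(1 + \Delta_2)}{4} + \frac{(1 + \Delta_1)(1 - \Delta_2)}{4} \\
&= \frac{1 - \Delta_1 \Delta_2}{2}.
\end{aligned}
\]
□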

If the elements of $A$ are independent and take the values 0 and 1 with equal
probabilities, then by Lemma 3.2.1, for any $X \ne 0$,

\[ P\{AX = 0\} = \big(P\{a_{11} x_1 + \cdots + a_{1n} x_n = 0\}\big)^T = 2^{-T}. \]

Therefore $E\mu_{n,T} = (2^n - 1) 2^{-T}$, and for $T = n + m$, where $m$ is a fixed integer,

\[ E\mu_{n,n+m} = \frac{1}{2^m} - \frac{1}{2^{n+m}}, \]

and as $n \to \infty$,

\[ E\mu_{n,n+m} \to \frac{1}{2^m}. \]

Under some conditions on the nonequiprobable distribution of the matrix $A$,
the last result still holds. Let

\[ p_{tj}^{(n)} = P\{a_{tj} = 1\}, \]

and, as before, denote by $\mu_{n,T}$ the number of nontrivial solutions of system (3.2.6).


Theorem 3.2.4. Under the conditions of Theorem 3.2.3,

\[ E\mu_{n,n+m} \to 2^{-m}. \]

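Proof. By using the indicators as in the calculation of the mean number of
solutions in the equiprobable case, we find that

\[ E\mu_{n,T} = \sum_{X \ne 0} P\{AX = 0\} = \sum_{k=1}^{n} \sum_{1 \le j_1 < \cdots < j_k \le n} P_{j_1,\dots,j_k}, \tag{3.2.7} \]

where, for any fixed set $\{j_1, \dots, j_k\}$ from the domain of summation, the term
$P_{j_1,\dots,j_k} = P\{AX = 0\}$ corresponds to the vector $X = (x_1, \dots, x_n)$ whose
elements with indices $j_1, \dots, j_k$ are 1 and the remaining elements are zero.

We represent the probabilities $p_{tj}^{(n)}$ as

\[ p_{tj}^{(n)} = \frac{1 - \Delta_{tj}}{2}. \]

According to the conditions of the theorem, there exists $\Delta < 1$ such that $|\Delta_{tj}| \le \Delta$
for all $t$ and $j$. Since the rows of $A$ are independent,

\[ P_{j_1,\dots,j_k} = \prod_{t=1}^{T} P^{(t)}_{j_1,\dots,j_k}, \]

where

\[ P^{(t)}_{j_1,\dots,j_k} = P\{a_{tj_1} + \cdots + a_{tj_k} = 0\}. \]

By Lemma 3.2.1,

\[ P\{a_{tj_1} + \cdots + a_{tj_k} = 0\} = \frac{1 + \Delta_{tj_1} \cdots \Delta_{tj_k}}{2}, \]

and for all $t$ and $1 \le j_1 < \cdots < j_k \le n$,

\[ \frac{1 - \Delta^k}{2} \le P^{(t)}_{j_1,\dots,j_k} \le \frac{1 + \Delta^k}{2}. \]

Hence, for $P_{j_1,\dots,j_k}$ we obtain the bounds

\[ \Big(\frac{1 - \Delta^k}{2}\Big)^T \le P_{j_1,\dots,j_k} \le \Big(\frac{1 + \Delta^k}{2}\Big)^T. \]

By using these inequalities, we find from (3.2.7) that

\[ \sum_{k=1}^{n} \binom{n}{k} \Big(\frac{1 - \Delta^k}{2}\Big)^T \le E\mu_{n,T} \le \sum_{k=1}^{n} \binom{n}{k} \Big(\frac{1 + \Delta^k}{2}\Big)^T. \tag{3.2.8} \]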

k=1 


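Now let $T = n + m$, where $m$ is a fixed integer. The left and the right sides of
(3.2.8) can be estimated in the same way. Therefore we obtain only an estimate of
the right-hand side. Let

\[ S(\Delta) = \sum_{k=1}^{n} \binom{n}{k} \Big(\frac{1 + \Delta^k}{2}\Big)^{n+m} \]

and compare $S(\Delta)$ to

\[ S(0) = \sum_{k=1}^{n} \binom{n}{k} \Big(\frac{1}{2}\Big)^{n+m}. \]

We have seen that $S(0) \to 2^{-m}$ as $n \to \infty$. We show that for any fixed $\Delta$,
$0 < \Delta < 1$, the difference $S(\Delta) - S(0)$ tends to zero. We divide $S(\Delta)$ into
two parts:

\[ S_1(\Delta) = \sum_{1 \le k \le \varepsilon n} \binom{n}{k} \Big(\frac{1 + \Delta^k}{2}\Big)^{n+m}, \]

\[ S_2(\Delta) = \sum_{\varepsilon n < k \le n} \binom{n}{k} \Big(\frac{1 + \Delta^k}{2}\Big)^{n+m}, \]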

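where $\varepsilon$, $0 < \varepsilon < 1/2$, will be chosen later. For the sake of simplicity, suppose
that $\varepsilon$ is such that $\varepsilon n$ is an integer; then for any $\varepsilon$ and $\Delta$, $0 < \Delta < 1$,
$0 < \varepsilon < 1/2$,

\[
\begin{aligned}
S_1(\Delta) = \sum_{1 \le k \le \varepsilon n} \binom{n}{k} \Big(\frac{1 + \Delta^k}{2}\Big)^{n+m}
&\le \varepsilon n \binom{n}{\varepsilon n} \Big(\frac{1 + \Delta}{2}\Big)^{n+m} \\
&\le \Big(\frac{1 + \Delta}{2}\Big)^{n+m} \frac{\varepsilon n \, n^{\varepsilon n}}{(\varepsilon n)^{\varepsilon n} \sqrt{\varepsilon n} \, e^{-\varepsilon n}}
\end{aligned}
\]

by using the inequality $n! \ge n^n \sqrt{n} \, e^{-n}$. This bound for $S_1(\Delta)$ can be written as

\[ S_1(\Delta) \le \sqrt{\varepsilon n} \, \Big(\frac{1 + \Delta}{2}\Big)^{m} \Big(\frac{1 + \Delta}{2 \varepsilon^{\varepsilon} e^{-\varepsilon}}\Big)^{n}. \]

If we choose a sufficiently small $\varepsilon$, we can make the value $(1 + \Delta)/(2 \varepsilon^{\varepsilon} e^{-\varepsilon})$ less
than 1. For such $\varepsilon$, the bound tends to zero as $n \to \infty$. Thus, there exists a fixed
$\varepsilon$, $0 < \varepsilon < 1/2$, such that the value $S_1(\Delta)$ and, consequently, $S_1(0)$ tend to zero,
and $S_1(\Delta) - S_1(0) \to 0$.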

We now estimate the difference $S_2(\Delta) - S_2(0)$. It is clear that

\[
\begin{aligned}
0 \le S_2(\Delta) - S_2(0)
&= \sum_{\varepsilon n < k \le n} \binom{n}{k} \frac{(1 + \Delta^k)^{n+m} - 1}{2^{n+m}} \\
&\le \big((1 + \Delta^{\varepsilon n})^{n+m} - 1\big) \sum_{\varepsilon n < k \le n} \binom{n}{k} \frac{1}{2^{n+m}} \\
&\le \frac{1}{2^m} \big((1 + \Delta^{\varepsilon n})^{n+m} - 1\big).
\end{aligned}
\]

Since $(1 + \Delta^{\varepsilon n})^{n+m} \to 1$ as $n \to \infty$, it follows from the estimate obtained
above that $S_2(\Delta) - S_2(0) \to 0$. Thus we have shown that $S(\Delta) - S(0) \to 0$ and
$S(0) \to 2^{-m}$; hence, $S(\Delta) \to 2^{-m}$. Theorem 3.2.4 is thus proved. □

We can actually relax the hypotheses of Theorem 3.2.4. The result remains true
if for $t = 1, \dots, T$, $j = 1, \dots, n$,

\[ \frac{\log n + x_n}{n} \le p_{tj}^{(n)} \le 1 - \frac{\log n + x_n}{n}, \]

where $x_n$ tends to infinity arbitrarily slowly (see [93]). These bounds are exact
in a sense because, as we will show in the next section, the limit distribution of
the rank of a matrix $A$ differs from the distribution given in Theorem 3.2.1 if the
probability of 1’s does not satisfy these inequalities.


3.3. Rank of sparse matrices 


In Section 3.1, we introduced the notion of critical sets of a matrix. Recall that a set
$\{t_1, \dots, t_m\}$ of row indices of a matrix in GF(2) is called critical if the
coordinate-wise sum of the rows with indices $t_1, \dots, t_m$ is the zero vector. The notion
of independence of critical sets was also introduced, and $s(A)$ denoted the maximum number
of independent critical sets of a matrix $A$. According to Theorem 3.1.1, the rank
$r(A)$ of a matrix $A$ is related to $s(A)$ by the equality $s(A) + r(A) = T$. Therefore,
instead of the rank of a matrix, we can investigate the maximum number $s(A)$ of
independent critical sets of the matrix.

In this section, critical sets are applied in the analysis of the rank of random
sparse matrices. Let the elements of a $T \times n$ matrix $A = \|a_{tj}\|$ be independent
random variables such that

\[ P\{a_{tj} = 1\} = \frac{\log n + x}{n}, \qquad P\{a_{tj} = 0\} = 1 - \frac{\log n + x}{n}, \tag{3.3.1} \]

where $x$ is a constant, $t = 1, \dots, T$, $j = 1, \dots, n$. We find the limit distribution
of $s(A)$ for such a matrix.


Theorem 3.3.1. If $n, T \to \infty$ such that $T/n \to \alpha$, $0 < \alpha < 1$, and condition
(3.3.1) is valid, then the distribution of the maximum number of independent critical
sets $s(A)$ converges to the Poisson distribution with parameter $\lambda = \alpha e^{-x}$.

We show first that the distribution of the number of critical sets that correspond
to zero rows of the matrix converges to a Poisson distribution. Denote the number
of zero rows of the matrix $A$ by $\xi_{n,T}$.


Lemma 3.3.1. If $n, T \to \infty$ such that $T/n \to \alpha$, $0 < \alpha < \infty$, and condition
(3.3.1) is valid, then for any fixed $k = 0, 1, \dots$,

\[ P\{\xi_{n,T} = k\} \to \frac{\lambda^k e^{-\lambda}}{k!}, \]

where $\lambda = \alpha e^{-x}$.




Proof. The probability $p_n$ that a fixed row consists entirely of zeros is

\[ p_n = \left( 1 - \frac{\log n + x}{n} \right)^{n}, \]

and under the conditions of the lemma,

\[ p_n = \frac{1}{n} e^{-x} (1 + o(1)). \]

The random variable $\xi_{n,T}$ has the binomial distribution with parameters $(T, p_n)$,
where $T$ is the number of trials and $p_n$ is the probability of success. Under the
conditions of the lemma, the mean number of successes $T p_n$ tends to $\alpha e^{-x}$;
hence, the binomial distribution converges to the Poisson distribution with parameter
$\alpha e^{-x}$. ∎
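
The convergence in Lemma 3.3.1 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the chosen values of $n$, $\alpha$ and $x$ are illustrative assumptions. It computes $p_n$ exactly and compares the binomial probabilities with the Poisson limit.

```python
import math

def zero_row_stats(n, alpha, x):
    """Exact p_n and T*p_n when P{a_tj = 1} = (log n + x)/n and T = round(alpha*n)."""
    T = round(alpha * n)
    p_one = (math.log(n) + x) / n
    p_n = (1.0 - p_one) ** n          # probability that a fixed row is all zeros
    return T, p_n, T * p_n

def binomial_pmf(T, p, k):
    """P{Binomial(T, p) = k}."""
    return math.comb(T, k) * p**k * (1.0 - p) ** (T - k)

def poisson_pmf(lam, k):
    """P{Poisson(lam) = k}."""
    return math.exp(-lam) * lam**k / math.factorial(k)

if __name__ == "__main__":
    alpha, x = 0.5, 0.0               # illustrative values only
    lam = alpha * math.exp(-x)
    for n in (10**3, 10**5, 10**7):
        T, p_n, mean = zero_row_stats(n, alpha, x)
        gap = max(abs(binomial_pmf(T, p_n, k) - poisson_pmf(lam, k)) for k in range(6))
        print(n, mean, lam, gap)
```

As $n$ grows, the printed mean $T p_n$ should approach $\alpha e^{-x}$ and the gap between the two distributions should shrink.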


We now prove that if $\alpha < 1$, then with probability tending to 1, all critical sets
consist of only zero rows.


Lemma 3.3.2. If $n, T \to \infty$ such that $T/n \to \alpha$, $\alpha < 1$, and condition (3.3.1) is
valid, then with probability tending to 1, the critical sets of $A$ consist of only zero
rows.


Proof. We consider the total number of critical sets each of which contains at least
one nonzero row. It is sufficient to prove that the mathematical expectation of this
number tends to zero. Although the proof of this fact is straightforward, it involves
many cumbersome estimations of sums containing the binomial coefficients.

An even number of successes among $k$ independent trials with probability of
success $p$ occurs with probability $(1 + (q - p)^k)/2$, where $q = 1 - p$.
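
The parity formula above is easy to confirm by direct summation. The sketch below is illustrative only (the function names are assumptions, not from the text): it sums the binomial probabilities of an even number of successes and compares the result with $(1 + (q-p)^k)/2$.

```python
import math

def prob_even_successes(k, p):
    """Sum the binomial probabilities of 0, 2, 4, ... successes in k trials."""
    return sum(math.comb(k, j) * p**j * (1 - p) ** (k - j)
               for j in range(0, k + 1, 2))

def closed_form(k, p):
    """The closed form (1 + (q - p)^k) / 2 with q = 1 - p."""
    q = 1 - p
    return (1 + (q - p) ** k) / 2

if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        for p in (0.1, 0.5, 0.9):
            print(k, p, prob_even_successes(k, p), closed_form(k, p))
```

The identity follows from expanding $((q + p)^k + (q - p)^k)/2$, which keeps exactly the even-index binomial terms.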

Let us find the probability that $k$ fixed rows form a critical set containing a
nonzero row. The indices of these rows form a critical set if each column of the
submatrix formed by these rows contains an even number of 1's. According to the
remark on the probability that the number of successes is even, for a single column
this probability equals

\[ \frac{1}{2} \left( 1 + \left( 1 - \frac{2(\log n + x)}{n} \right)^{k} \right). \]

Therefore the probability that these $k$ rows constitute a critical set equals

\[ \frac{1}{2^n} \left( 1 + \left( 1 - \frac{2(\log n + x)}{n} \right)^{k} \right)^{n}. \]

Note that the probability that there is no 1 in all these $k$ rows is equal to

\[ \left( 1 - \frac{\log n + x}{n} \right)^{kn}. \]
By using the corresponding indicators to represent the total number of nontrivial
critical sets and the number of the critical sets that consist of zero rows, we obtain
the following expression for the mean number of critical sets that do not consist
of zero rows:

\[ \sum_{k=1}^{T} \binom{T}{k} \frac{1}{2^n} r_k^{\,n} - \sum_{k=1}^{T} \binom{T}{k} \left( 1 - \frac{\log n + x}{n} \right)^{kn}, \]

where

\[ r_k = 1 + \left( 1 - \frac{2(\log n + x)}{n} \right)^{k}. \]

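We include the terms with $k = 0$ into these sums because they cancel each other.
Note first that

\[ \sum_{k=0}^{T} \binom{T}{k} \left( 1 - \frac{\log n + x}{n} \right)^{kn} = \left( 1 + \left( 1 - \frac{\log n + x}{n} \right)^{n} \right)^{T}, \]

and under the conditions of the lemma,

\[ \left( 1 + \left( 1 - \frac{\log n + x}{n} \right)^{n} \right)^{T} = \left( 1 + \frac{1}{n} e^{-x} + o\!\left( \frac{1}{n} \right) \right)^{T} = e^{\alpha e^{-x}} (1 + o(1)). \]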


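Now consider the sum

\[ \sum_{k=0}^{T} \binom{T}{k} \frac{1}{2^n} r_k^{\,n} = \sum_{k=0}^{T} \binom{T}{k} \frac{1}{2^n} \left( 1 + \left( 1 - \frac{2(\log n + x)}{n} \right)^{k} \right)^{n}. \]

Set $a = 1 - 2(\log n + x)/n$ for now. The following equalities hold:

\[ \sum_{k=0}^{T} \binom{T}{k} \frac{1}{2^n} (1 + a^k)^n
   = \frac{1}{2^n} \sum_{k=0}^{T} \binom{T}{k} \sum_{l=0}^{n} \binom{n}{l} a^{kl}
   = \frac{1}{2^n} \sum_{l=0}^{n} \binom{n}{l} \sum_{k=0}^{T} \binom{T}{k} a^{lk}
   = \frac{1}{2^n} \sum_{l=0}^{n} \binom{n}{l} (1 + a^l)^T. \]

We divide the sum

\[ S(n, T) = \sum_{k=0}^{n} \binom{n}{k} \frac{1}{2^n} (1 + a^k)^T \]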
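into five parts so that

\[ S(n, T) = S_1 + S_2 + S_3 + S_4 + S_5, \]

where

\[ S_1 = \sum_{0 \le k \le k_1} a_k, \quad S_2 = \sum_{k_1 < k \le k_2} a_k, \quad S_3 = \sum_{k_2 < k < k_3} a_k, \quad
   S_4 = \sum_{k_3 \le k \le k_4} a_k, \quad S_5 = \sum_{k_4 < k \le n} a_k, \]

\[ a_k = \binom{n}{k} \frac{1}{2^n} r_k^{\,T}, \qquad
   k_1 = \varepsilon n, \quad k_2 = \frac{n}{2}(1 - \varepsilon), \quad
   k_3 = \frac{n}{2} - \frac{1}{2} n^{1/2 + 1/10}, \quad k_4 = \frac{n}{2} + \frac{1}{2} n^{1/2 + 1/10}, \]

and the value of $\varepsilon$ will be chosen later. For convenience we present the graphs of
the functions $\binom{n}{k} 2^{-n}$ and $r_k^{\,T} = (1 + (1 - 2(\log n + x)/n)^k)^T$ as functions of $k$ in
Figure 3.3.1.

Figure 3.3.1. Graphs of the functions $\binom{n}{k} 2^{-n}$ and $r_k^{\,T}$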

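The major contribution to $S(n, T)$ is made by the sum $S_4$. It is clear that

\[ \left( 1 - \frac{2(\log n + x)}{n} \right)^{k} = e^{-2k(\log n + x)/n} (1 + o(1)) = \frac{1}{n} e^{-x} (1 + o(1)) \]

uniformly in the integers $k = n/2 + u\sqrt{n}/2$ such that $|u| \le n^{1/10}$. These $k$ form
the domain of summation of $S_4$, which equals $\{k : |u| \le n^{1/10}\}$. Therefore

\[ r_k^{\,T} = \left( 1 + \left( 1 - \frac{2(\log n + x)}{n} \right)^{k} \right)^{T}
   = \left( 1 + \frac{1}{n} e^{-x} + o\!\left( \frac{1}{n} \right) \right)^{T}
   = e^{\alpha e^{-x}} (1 + o(1)) \]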
uniformly in $k$ in the domain of summation of $S_4$. Thus

\[ S_4 = \sum_{k : |u| \le n^{1/10}} \binom{n}{k} \frac{1}{2^n} r_k^{\,T}
       = \sum_{k : |u| \le n^{1/10}} \binom{n}{k} \frac{1}{2^n} e^{\alpha e^{-x}} (1 + o(1))
       = e^{\alpha e^{-x}} (1 + o(1)), \]

since by the de Moivre–Laplace theorem,

\[ \sum_{k : |u| \le n^{1/10}} \binom{n}{k} \frac{1}{2^n} \to 1. \]

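We now have to show that the remaining four sums tend to zero. We begin with

\[ S_5 = \sum_{k > n/2 + n^{1/10} \sqrt{n}/2} \binom{n}{k} \frac{1}{2^n} r_k^{\,T}. \]

Since $r_k$ is monotone, we find that

\[ S_5 \le r_{k_4}^{\,T} \sum_{k : u > n^{1/10}} \binom{n}{k} \frac{1}{2^n}. \]

Under the conditions of the lemma $S_5 \to 0$, since, as was proved, $r_{k_4}^{\,T} \to e^{\alpha e^{-x}}$,
and according to the de Moivre–Laplace theorem,

\[ \sum_{k : u > n^{1/10}} \binom{n}{k} \frac{1}{2^n} \to 0. \]

Next consider

\[ S_1 = \sum_{0 \le k \le k_1} \binom{n}{k} \frac{1}{2^n} r_k^{\,T}. \]

By using the monotonicity of $r_k^{\,T}$, we find that, for sufficiently small $\varepsilon$ such that $\varepsilon n$
is an integer,

\[ S_1 \le r_0^{\,T} \sum_{k \le k_1} \binom{n}{k} \frac{1}{2^n} \le (1 + k_1) \binom{n}{k_1} 2^{T-n}. \]

Let us estimate

\[ (1 + k_1) \binom{n}{k_1} 2^{T-n} \le \frac{(1 + \varepsilon n)\, n^{\varepsilon n}\, 2^{T}}{(\varepsilon n)^{\varepsilon n} \sqrt{\varepsilon n}\, e^{-\varepsilon n}\, 2^{n}}
   = \frac{1 + \varepsilon n}{\sqrt{\varepsilon n}} \left( \frac{2^{T/n}}{2 \varepsilon^{\varepsilon} e^{-\varepsilon}} \right)^{n}. \]

It is clear that $2^{T/n} (2 \varepsilon^{\varepsilon} e^{-\varepsilon})^{-1} \le q < 1$ for sufficiently small $\varepsilon$; therefore $S_1 \to 0$
as $n \to \infty$.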
It remains to consider $S_2$ and $S_3$. Let us begin with

\[ S_2 = \sum_{\varepsilon n < k \le n(1 - \varepsilon)/2} a_k. \]
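We first show that $a_k$ is a monotone increasing function for $k$ such that $\varepsilon n < k \le
n(1 - \varepsilon)/2$. Indeed,

\[ \begin{aligned}
\frac{a_{k+1}}{a_k} &= \frac{n-k}{k+1} \left( \frac{r_{k+1}}{r_k} \right)^{T}
  = \frac{n-k}{k+1} \left( \frac{1 + (1 - 2(\log n + x)/n)^{k+1}}{1 + (1 - 2(\log n + x)/n)^{k}} \right)^{T} \\
  &\ge \frac{n - n(1-\varepsilon)/2}{n(1-\varepsilon)/2 + 1}
     \left( 1 - \frac{(1 - 2(\log n + x)/n)^{k} - (1 - 2(\log n + x)/n)^{k+1}}{1 + (1 - 2(\log n + x)/n)^{k}} \right)^{T} \\
  &= \frac{1+\varepsilon}{1 - \varepsilon + 2/n}
     \left( 1 - \frac{(1 - 2(\log n + x)/n)^{k} - (1 - 2(\log n + x)/n)^{k+1}}{1 + (1 - 2(\log n + x)/n)^{k}} \right)^{T}.
\end{aligned} \]

Since $1 + (1 - 2(\log n + x)/n)^{k} \ge 1$, we obtain

\[ \begin{aligned}
\frac{a_{k+1}}{a_k} &\ge \frac{1+\varepsilon}{1 - \varepsilon + 2/n}
     \left( 1 - \left( 1 - \frac{2(\log n + x)}{n} \right)^{k} \left( 1 - \left( 1 - \frac{2(\log n + x)}{n} \right) \right) \right)^{T} \\
  &= \frac{1+\varepsilon}{1 - \varepsilon + 2/n}
     \left( 1 - \frac{2(\log n + x)}{n} \left( 1 - \frac{2(\log n + x)}{n} \right)^{k} \right)^{T}.
\end{aligned} \]

For sufficiently large $n$,

\[ (1 + \varepsilon)/(1 - \varepsilon + 2/n) > 1 + \varepsilon. \]

Moreover, for $k$ satisfying $\varepsilon n < k \le n(1 - \varepsilon)/2$,

\[ \left( 1 - \frac{2(\log n + x)}{n} \right)^{k} \le e^{-2k(\log n + x)/n} \le c\, n^{-2\varepsilon}, \]

where $c$ is the constant $e^{-2\varepsilon x}$.
Thus, for sufficiently large $n$,

\[ \frac{a_{k+1}}{a_k} \ge (1 + \varepsilon) \left( 1 - \frac{2c(\log n + x)}{n^{1 + 2\varepsilon}} \right)^{T} \ge (1 + \varepsilon)(1 - \varepsilon/2) > 1. \]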
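If we estimate $S_2$, we can use the monotonicity of $a_k$ to obtain the inequality

\[ S_2 \le n a_{k_2}. \]

Let us estimate

\[ a_{k_2} = \binom{n}{k_2} \frac{1}{2^n} \left( 1 + \left( 1 - \frac{2(\log n + x)}{n} \right)^{k_2} \right)^{T}. \]

Since a rough estimate is acceptable, we content ourselves with the bound

\[ \binom{n}{k_2} \frac{1}{2^n} \le \sum_{k : u \le -n^{1/10}} \binom{n}{k} \frac{1}{2^n}
   = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-n^{1/10}} e^{-u^2/2}\, du\, (1 + o(1))
   = \frac{1}{\sqrt{2\pi}\, n^{1/10}} e^{-n^{1/5}/2} (1 + o(1)). \]

Here we used the well-known asymptotics

\[ \int_{z}^{\infty} e^{-u^2/2}\, du = \frac{1}{z} e^{-z^2/2} (1 + o(1)) \]

as $z \to \infty$. Thus, there exists a constant $d$ such that

\[ \sum_{k : u \le -n^{1/10}} \binom{n}{k} \frac{1}{2^n} \le d\, e^{-n^{1/5}/2}. \]

Let us estimate the second factor of $a_{k_2}$. It is clear that

\[ \left( 1 + \left( 1 - \frac{2(\log n + x)}{n} \right)^{k_2} \right)^{T} \le \left( 1 + e^{-2k_2(\log n + x)/n} \right)^{T}
   = \left( 1 + e^{-(1 - \varepsilon)(\log n + x)} \right)^{T} \le e^{b n^{\varepsilon}}, \]

where $b$ is a positive constant.
By combining the estimates of the two factors of $a_{k_2}$, we obtain the bound

\[ S_2 \le n a_{k_2} \le n\, d\, e^{-n^{1/5}/2}\, e^{b n^{\varepsilon}}, \]

and $S_2 \to 0$ if we choose $\varepsilon < 1/5$.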
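It remains to estimate

\[ S_3 = \sum_{k_2 < k < k_3} \binom{n}{k} \frac{1}{2^n} \left( 1 + \left( 1 - \frac{2(\log n + x)}{n} \right)^{k} \right)^{T}. \]

It is clear that

\[ S_3 \le \left( 1 + \left( 1 - \frac{2(\log n + x)}{n} \right)^{k_2} \right)^{T} \sum_{k_2 < k < k_3} \binom{n}{k} \frac{1}{2^n}
   \le e^{b n^{\varepsilon}}\, d\, e^{-n^{1/5}/2}, \]

and $S_3 \to 0$ if $\varepsilon < 1/5$. ∎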


Proof of Theorem 3.3.1. The assertion of Theorem 3.3.1 follows from Lemmas
3.3.1 and 3.3.2 because by Lemma 3.3.2,

\[ P\{s(A) = \xi_{n,T}\} \to 1 \]

under the conditions of the theorem. ∎


The following theorem is a corollary to Theorem 3.3.1. Suppose that 


logT +x (n) logT +x 
a co <1 - 2, 3.3.2 
TP 5 T (3.3.2) 
where $x$ is a constant and $t = 1, \dots, T$, $j = 1, \dots, n$.


Theorem 3.3.2. If $n, T \to \infty$ such that $T/n \to \alpha$, $1 < \alpha < \infty$, and condition
(3.3.2) is valid, then the distribution of $s(A)$ converges to the Poisson distribution
with parameter $\lambda = e^{-x}/\alpha$.


Proof. Since the rank of a matrix is the maximum number of linearly independent
rows or columns, we apply Theorem 3.3.1 to the transpose matrix and obtain the
assertion of Theorem 3.3.2. ∎


Because we know the limit distribution of the rank of a matrix $A$, we can obtain
some results for the behavior of the solutions to the system of linear equations with
the matrix $A$. Let us consider the system

\[ AX = B, \tag{3.3.3} \]

where the elements of the $T \times n$ matrix $A = \|a_{tj}\|$ are independent, and for
$t = 1, \dots, T$, $j = 1, \dots, n$,

\[ P\{a_{tj} = 1\} = \frac{\log n + x}{n}, \]

where $x$ is a constant, the column-vector $B = (b_1, \dots, b_T)$ is independent of $A$,
and the random variables $b_1, \dots, b_T$ are independent, taking the values 0 and 1
with equal probabilities.

Denote by $\mu_{n,T}$ the number of solutions of the system (3.3.3). The examples
cited in Section 3.1 show that the consistency of linear systems plays a particular
role in some of the problems related to such systems. The probability of consistency
$P_{n,T}$ of system (3.3.3) is the probability that the system has at least one solution:

\[ P_{n,T} = P\{\mu_{n,T} > 0\}. \]

By using Theorem 3.3.1 we can easily prove the following assertion.


Theorem 3.3.3. If $n, T \to \infty$ such that $T/n \to \alpha$, $0 < \alpha < 1$, and condition
(3.3.1) is valid, then

\[ P_{n,T} \to e^{-\alpha e^{-x}/2}. \]


Proof. If the rank $r(A)$ of $A$ equals $r$, then

\[ P\{\mu_{n,T} > 0 \mid r(A) = r\} = 2^{r-T}. \tag{3.3.4} \]

Indeed, let the linearly independent rows have the indices $1, 2, \dots, r$. Then each
of the rows with indices $r + 1, \dots, T$ is a linear combination of the first $r$ rows,
and for the system to be consistent, each of the right-hand sides $b_{r+1}, \dots, b_T$ must
satisfy a linear relation of the form

\[ \varepsilon_{1t} b_1 + \cdots + \varepsilon_{rt} b_r = b_t, \qquad t = r + 1, \dots, T, \tag{3.3.5} \]

where $\varepsilon_{1t}, \dots, \varepsilon_{rt}$ are constants taking the values 0 and 1. The probability of the
validity of any of the relations (3.3.5) is equal to 1/2 and, hence, assertion (3.3.4)
is true.
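
Assertion (3.3.4) can be verified by brute force on small matrices. The sketch below is only an illustration under stated assumptions: rows are stored as integer bitmasks, and the helper names `gf2_rank` and `consistent_fraction` are hypothetical. It computes the rank over GF(2) and the fraction of the $2^T$ right-hand sides for which the system is solvable.

```python
import random

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    basis = {}  # leading-bit position -> stored row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]  # eliminate the leading bit and continue
    return len(basis)

def consistent_fraction(rows, n):
    """Fraction of right-hand sides b for which Ax = b has a solution.

    Enumerates all 2^n vectors x, so it is feasible only for small n and T.
    """
    T = len(rows)
    reachable = set()
    for x in range(1 << n):
        b = 0
        for t, row in enumerate(rows):
            b |= (bin(row & x).count("1") & 1) << t  # parity of <row, x>
        reachable.add(b)
    return len(reachable) / (1 << T)

if __name__ == "__main__":
    random.seed(1)
    n, T = 5, 7
    rows = [random.getrandbits(n) for _ in range(T)]
    r = gf2_rank(rows)
    print(r, consistent_fraction(rows, n), 2.0 ** (r - T))
```

The two printed probabilities agree because the image of $x \mapsto Ax$ is a subspace with $2^r$ elements among the $2^T$ possible right-hand sides.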

Since $\{r(A) = r\} = \{s(A) = T - r\}$, by the total probability formula,

\[ P_{n,T} = \sum_{r=0}^{T} P\{r(A) = r\} \frac{1}{2^{T-r}} = \sum_{s=0}^{T} P\{s(A) = s\} \frac{1}{2^{s}}. \tag{3.3.6} \]

The last series from (3.3.6) is majorized by the series $\sum_{s=0}^{\infty} 2^{-s}$ and converges
uniformly. Therefore it is possible to pass to the limit under the sum in (3.3.6).
Passing to the limit with the help of Theorem 3.3.1 yields

\[ P_{n,T} \to \sum_{s=0}^{\infty} \frac{\lambda^s e^{-\lambda}}{s!} \frac{1}{2^s} = e^{-\lambda/2}, \]

where $\lambda = \alpha e^{-x}$. ∎
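
The limit in the proof of Theorem 3.3.3 is a Poisson average of $2^{-s}$. As a short worked step, using only the series for the exponential function:

```latex
\[
\sum_{s=0}^{\infty} \frac{\lambda^{s} e^{-\lambda}}{s!} \cdot \frac{1}{2^{s}}
  = e^{-\lambda} \sum_{s=0}^{\infty} \frac{(\lambda/2)^{s}}{s!}
  = e^{-\lambda} e^{\lambda/2}
  = e^{-\lambda/2}.
\]
```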


3.4. Cycles and consistency of systems of random equations 


In this section, we consider a system of $T$ equations in GF(2):

\[ x_{i(t)} + x_{j(t)} = \beta_t, \qquad t = 1, \dots, T, \tag{3.4.1} \]




where $i(t)$, $j(t)$, $t = 1, \dots, T$, are independent random variables that take the
values $1, \dots, n$ with equal probabilities, and the variables $\beta_1, \dots, \beta_T$ take the
values 0 and 1. We denote by $A_{n,T}$ the matrix of this system. As in Section 3.1,
we associate the matrix $A_{n,T}$ with a graph $G_{n,T}$ with $n$ labeled vertices that
correspond to the variables $x_1, \dots, x_n$. The graph has $T$ edges $(i(t), j(t))$, $t =
1, \dots, T$. Thus the edges of the graph $G_{n,T}$ may be considered an outcome of
$T$ independent trials: In each trial, an edge joins two different vertices $i$ and
$j$ with probability $2n^{-2}$ and forms a loop at a vertex $i$ with probability $n^{-2}$,
$i, j = 1, \dots, n$. Thus the graph $G_{n,T}$ is the same as the graph considered in
Section 2.3.

Denote by $\mu_{n,T}$ the number of solutions of the system (3.4.1) and consider the
probability of consistency

\[ P_{n,T} = P\{\mu_{n,T} > 0\}. \]

We want to express $P_{n,T}$ in terms of the characteristics of $G_{n,T}$. Denote by $\kappa_{n,T}$
the number of components of the graph $G_{n,T}$.


Theorem 3.4.1. If $\beta_1, \dots, \beta_T$ are independent random variables that take the
values 0 and 1 with equal probabilities and do not depend on $A_{n,T}$, then

\[ P_{n,T} = \frac{1}{2^{T-n}} \sum_{k=1}^{n} P\{\kappa_{n,T} = k\} \frac{1}{2^{k}}. \]


Proof. We first assume that $G_{n,T}$ is a connected graph. We can then choose a tree
that is a skeleton of the graph. This tree contains $n - 1$ edges that correspond to
a subsystem containing $n - 1$ equations of the system. If we assign a fixed value
to one of the unknowns, then with the help of the corresponding subsystem, we
obtain the values of all other unknowns. Consequently, the right-hand sides of
the remaining $T - n + 1$ equations must each take a fixed value for the system
to be consistent. Since $\beta_1, \dots, \beta_T$ are independent and take the values 0 and
1 with probabilities 1/2, the probability of consistency is $(1/2)^{T-n+1}$ for $G_{n,T}$
connected.

Now assume the graph $G_{n,T}$ consists of $k$ components with $n_1, \dots, n_k$ vertices
and $T_1, \dots, T_k$ edges, respectively. The whole system is consistent if and only
if each of its subsystems is consistent. Under the condition that the number of
components $\kappa_{n,T} = k$ and, consequently, that the system decomposes into $k$
disjoint subsystems, the probability of consistency is

\[ \frac{1}{2^{T_1 - n_1 + 1}} \cdot \frac{1}{2^{T_2 - n_2 + 1}} \cdots \frac{1}{2^{T_k - n_k + 1}} = \frac{1}{2^{T - n + k}}. \]

When we apply the formula of total probability, we obtain the assertion of the
theorem. ∎
According to Theorem 3.4.1, the number of components of the graph $G_{n,T}$ can
be used to investigate the system (3.4.1). Likewise, we can consider the maximum
number of independent critical sets $s(A_{n,T})$ introduced in Section 3.1. According
to Theorem 3.1.1, the maximum number of independent critical sets $s(A_{n,T})$ and
the rank $r(A_{n,T})$ of the matrix $A_{n,T}$ are related by the equality

\[ s(A_{n,T}) + r(A_{n,T}) = T. \]

It is not difficult to prove that

\[ \kappa_{n,T} = n - T + s(A_{n,T}), \]

and the rank $r(A_{n,T}) = n - \kappa_{n,T}$. Thus, the assertion of Theorem 3.4.1 is equivalent
to relation (3.3.6).
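
The relation between the rank and the number of components can be checked on a random instance. The following sketch is illustrative only: vertices are numbered from 0, each row of the system is stored as a bitmask, and all function names are assumptions. It counts components with a union-find structure and compares $n - \kappa_{n,T}$ with the rank over GF(2).

```python
import random

def count_components(n, edges):
    """Number of connected components of a multigraph on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    comps = n
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            comps -= 1
    return comps

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    basis = {}
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
    return len(basis)

def system_rows(edges):
    """Row of x_i + x_j; a loop (i == j) gives the zero row, since x_i + x_i = 0."""
    return [(1 << i) ^ (1 << j) for i, j in edges]

if __name__ == "__main__":
    random.seed(0)
    n, T = 30, 20
    edges = [(random.randrange(n), random.randrange(n)) for _ in range(T)]
    kappa = count_components(n, edges)
    r = gf2_rank(system_rows(edges))
    print(kappa, r, n - kappa, T - r)
```

Here `T - r` is the maximum number of independent critical sets, so the printed values also illustrate $\kappa_{n,T} = n - T + s(A_{n,T})$.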

We remarked in Section 3.1 that a critical set of $A_{n,T}$ corresponds to a cycle
or a union of cycles in the graph $G_{n,T}$, and the maximum number of independent
critical sets $s(A_{n,T})$ equals the maximum number of independent cycles.

The graph $G_{n,T}$ was studied in Section 2.3. We have seen that if $n, T \to \infty$
such that $2T/n \to \lambda$, $0 < \lambda < 1$, then with probability tending to 1, the graph
has no components with more than one cycle. Therefore, under these conditions,
all cycles of $G_{n,T}$ are isolated and, consequently, independent. As in Section 3.1,
we denote by $\nu(G_{n,T})$ the number of cycles in $G_{n,T}$.

It was proven (see Theorems 2.3.3 and 2.3.4) that if $2T/n \to \lambda$, $0 < \lambda < 1$,
then

\[ P\{\nu(G_{n,T}) = s(A_{n,T})\} \to 1, \tag{3.4.2} \]

and for any fixed $k = 0, 1, \dots$,

\[ P\{\nu(G_{n,T}) = k\} \to \frac{\Lambda^k e^{-\Lambda}}{k!}, \tag{3.4.3} \]

where

\[ \Lambda = -\tfrac{1}{2} \log(1 - \lambda). \]

These results allow us to analyze the probability $P_{n,T}$ of consistency of the system
(3.4.1).


Theorem 3.4.2. If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, and the right-hand
sides $\beta_1, \dots, \beta_T$ of the system (3.4.1) are independent random variables
that take the values 0 and 1 with probabilities 1/2 and do not depend on $A_{n,T}$,
then

\[ P_{n,T} \to (1 - \lambda)^{1/4}. \]
Proof. When we use Theorem 3.4.1 or the equivalent formula (3.3.6), we find
that

\[ P_{n,T} = \sum_{k=1}^{n} P\{\kappa_{n,T} = k\} \frac{1}{2^{T-n+k}}
          = \sum_{r=0}^{n} P\{r(A_{n,T}) = r\} \frac{1}{2^{T-r}}
          = \sum_{s=0}^{T} P\{s(A_{n,T}) = s\} \frac{1}{2^{s}}. \]

Taking into account (3.4.2) and (3.4.3) and passing to the limit under the sum
yield

\[ P_{n,T} \to \sum_{s=0}^{\infty} \frac{\Lambda^s e^{-\Lambda}}{s!\, 2^s} = e^{-\Lambda/2} = (1 - \lambda)^{1/4}. \]


In the same way, we can treat the nonequiprobable case, where the indices
$i(t)$, $j(t)$, $t = 1, \dots, T$, of the variables of system (3.4.1) are independent
identically distributed random variables that take the value $i$ with probability $p_i$,
$i = 1, \dots, n$, $p_1 + \cdots + p_n = 1$. As before, let the right-hand sides $\beta_1, \dots, \beta_T$
be independent, take the values 0 and 1 with equal probabilities, and not depend
on $A_{n,T}$. We retain the notation $P_{n,T}$ for the probability of consistency of such a
system.


Theorem 3.4.3. Let p; = aj/n, where aj = aj(n), 0 < & < aj < & < ©, 
i=1,...,n, &9 and & are constants, and let 


\[ a^2 = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} a_i^2. \]

If $n, T \to \infty$ such that $2T/n \to \lambda$ and $a^2 \lambda < 1$, then

\[ P_{n,T} \to (1 - a^2 \lambda)^{1/4}. \]


Proof. In Section 2.4, the nonequiprobable graph $G_{n,T}$ corresponding to the matrix
$A_{n,T}$ was considered. The graph contains $n$ labeled vertices and $T$ edges that
can be obtained by the following $T$ independent trials. In each trial, one edge is
drawn. The edge connects two different vertices $i$ and $j$ with probability $2 p_i p_j$, and
a loop at a vertex $i$ is formed with probability $p_i^2$, $i, j = 1, \dots, n$, $p_1 + \cdots + p_n = 1$.

According to Theorem 2.4.1, under the conditions of Theorem 3.4.3 for any
fixed $k = 0, 1, \dots$,

\[ P\{\nu(G_{n,T}) = k\} \to \frac{\Lambda^k e^{-\Lambda}}{k!}, \]

where $\nu(G_{n,T})$ is the number of cycles in $G_{n,T}$, and

\[ \Lambda = -\tfrac{1}{2} \log(1 - a^2 \lambda). \]

If we reason as we did in the proof of Theorem 3.4.2, we obtain the assertion
of Theorem 3.4.3. ∎


The proofs of Theorems 3.4.2 and 3.4.3 are mainly based on assertion (3.3.4)
that

\[ P\{\mu_{n,T} > 0 \mid r(A_{n,T}) = r\} = 2^{r-T}. \tag{3.4.4} \]

The proof of this assertion in Section 3.3 used the fact that if $r$ rows are linearly
independent and $r(A_{n,T}) = r$, then each of the remaining rows is a linear
combination of these $r$ rows, and the system is consistent only if the corresponding
right-hand sides satisfy a certain linear relation. If the right-hand sides
$\beta_1, \dots, \beta_T$ are independent, then such a relation is satisfied with probability 1/2,
and the events corresponding to different relations are independent. In other words,
each cycle in $G_{n,T}$ imposes a restriction on the right-hand sides $\beta_1, \dots, \beta_T$,
these restrictions are independent, and each of them is satisfied with probability 1/2.

If the right-hand sides $\beta_1, \dots, \beta_T$ take the values 0 and 1 with unequal
probabilities, then property (3.4.4) is not valid, and the corresponding formula for the
probability $P_{n,T}$ of the consistency of the system becomes more complicated. In
this section, we prove the following assertions.

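Let

\[ P_{n,T}(k) = P\{\mu_{n,T} > 0,\ \nu(G_{n,T}) = k\}, \qquad P_{n,T} = P\{\mu_{n,T} > 0\}. \]


Theorem 3.4.4. Let the right-hand sides $\beta_1, \dots, \beta_T$ of the system (3.4.1) be
independent identically distributed random variables that take the values 0 and 1
with probabilities $1 - p$ and $p$, respectively, $0 < p < 1$, $\Delta = 1 - 2p$.

If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, then for any fixed $k =
0, 1, \dots$,

\[ P_{n,T}(k) \to \frac{1}{4^k k!} \bigl( -\log\bigl( (1 - \lambda)(1 - \Delta\lambda) \bigr) \bigr)^{k} \sqrt{1 - \lambda}, \]

\[ P_{n,T} \to \left( \frac{1 - \lambda}{1 - \Delta\lambda} \right)^{1/4}. \]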

Theorem 3.4.5. Let the right-hand sides $\beta_1, \dots, \beta_T$ of the system (3.4.1) take
the values 0 and 1, and let $m = m(T)$ be the number of 1's in $\beta_1, \dots, \beta_T$.
If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, and $m/T \to p$, $0 \le p \le 1$,
then for any fixed $k = 0, 1, \dots$,

\[ P_{n,T}(k) \to \frac{1}{4^k k!} \bigl( -\log\bigl( (1 - \lambda)(1 - \Delta\lambda) \bigr) \bigr)^{k} \sqrt{1 - \lambda}, \]

\[ P_{n,T} \to \left( \frac{1 - \lambda}{1 - \Delta\lambda} \right)^{1/4}, \]

where $\Delta = 1 - 2p$.


Before proceeding to the proof of these theorems, we will establish some auxiliary
results. Let $\beta_1, \dots, \beta_T$ be independent identically distributed random variables
that take the values 0 and 1 with probabilities $1 - p$ and $p$, respectively; let
$\Delta = 1 - 2p$; and let $E$ be the set of the even numbers. Let $r_0 = 0$ and $r_1, \dots, r_k$
be positive integers. We consider the random variables

\[ \eta_i = \beta_{r_0 + \cdots + r_{i-1} + 1} + \cdots + \beta_{r_0 + \cdots + r_i}, \qquad i = 1, \dots, k. \]


Lemma 3.4.1.

\[ P\{\eta_i \in E,\ i = 1, \dots, k\} = \frac{1}{2^k} (1 + \Delta^{r_1}) \cdots (1 + \Delta^{r_k}). \]

Proof. It suffices to note that the random variables $\eta_1, \dots, \eta_k$ are independent
and that the probability of the event of the sum $\beta_1 + \cdots + \beta_r$ being even equals
$(1 + \Delta^r)/2$. ∎

When the variables $\beta_1, \dots, \beta_T$ are nonrandom, we need a similar assertion for
the following scheme of allocating $m$ particles into $T$ cells. The cells are divided
into $k + 1$ groups of cells containing $r_1, \dots, r_k$, $T - r_1 - \cdots - r_k$ cells, respectively.
We assume that each cell can contain at most one particle, that $m \le T$, and that each
of the $\binom{T}{m}$ possible allocations is equiprobable. We introduce the random variables
$\xi_1, \dots, \xi_T$, setting $\xi_i = 0$ if the cell number $i$ is empty, and $\xi_i = 1$ otherwise,
for $i = 1, \dots, T$. By analogy with the random variables $\eta_1, \dots, \eta_k$, we define the
random variables

\[ \zeta_i = \xi_{r_0 + \cdots + r_{i-1} + 1} + \cdots + \xi_{r_0 + \cdots + r_i}, \qquad i = 1, \dots, k. \]

It is not difficult to verify the following assertions.

Lemma 3.4.2. If $r_1, \dots, r_k$ are fixed, $T \to \infty$, and $m/T \to 0$, then

\[ P\{\zeta_i \in E,\ i = 1, \dots, k\} \to 1. \]


Lemma 3.4.3. If $r_1, \dots, r_k$ are fixed, $T \to \infty$, and $m/T \to 1$, then

\[ P\{\zeta_i \in E,\ i = 1, \dots, k\} \to 1 \]

if all $r_1, \dots, r_k$ are even; and

\[ P\{\zeta_i \in E,\ i = 1, \dots, k\} \to 0 \]

if at least one of $r_1, \dots, r_k$ is odd.


Lemma 3.4.4. If $r_1, \dots, r_k$ are fixed, $T \to \infty$, and $m/T \to p$, $0 < p < 1$, then

\[ P\{\zeta_i \in E,\ i = 1, \dots, k\} \to \frac{1}{2^k} (1 + \Delta^{r_1}) \cdots (1 + \Delta^{r_k}), \]

where $\Delta = 1 - 2p$.
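
For a single group of $r$ cells, the probability in Lemmas 3.4.2 to 3.4.4 is an exact hypergeometric sum, which makes the limits easy to inspect numerically. The sketch below is an illustration only; the function name and the test values of $T$, $m$ and $r$ are assumptions.

```python
import math

def prob_even_occupied(T, m, r):
    """P{an even number of r fixed cells are occupied} when m particles are
    placed in T cells, at most one per cell, all C(T, m) allocations equally likely."""
    total = math.comb(T, m)
    even = sum(math.comb(r, j) * math.comb(T - r, m - j)
               for j in range(0, min(r, m) + 1, 2))
    return even / total

if __name__ == "__main__":
    T, r = 20000, 3
    for m in (1, 6000, T):
        p = m / T
        delta = 1 - 2 * p
        # Exact value next to the limiting value (1 + delta^r) / 2
        print(m, prob_even_occupied(T, m, r), (1 + delta**r) / 2)
```

With $m/T$ close to 0 the probability is close to 1, with $m = T$ it is 0 for odd $r$ and 1 for even $r$, and in between it is close to $(1 + \Delta^r)/2$.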


We now consider the graph $G_{n,T}$ and mark the cycles in the graph by the
following rule. Recall that $\mathcal{A}_{n,T}$ is the set of all graphs with $n$ labeled vertices and
$T$ edges whose components are trees and unicyclic components, allowing cycles
of length 1 and 2. If a realization of the graph $G_{n,T}$ belongs to the set $\mathcal{A}_{n,T}$, then
every cycle of length $r$ is marked with probability $p_r$ independently of the others.
If the graph contains a component with more than one cycle, then no cycle of the
graph is marked. We denote by $p_{n,T}(k)$ the probability of the event that the number
of cycles $\nu(G_{n,T})$ in the graph $G_{n,T}$ is equal to $k$ and all cycles are marked. It is
clear that the probability $p_{n,T}$ of the event that all cycles are marked equals

\[ p_{n,T} = \sum_{k=0}^{\infty} p_{n,T}(k). \]
As in Section 1.7, we denote by $d_m$ the number of mappings of the set $\{1, \dots, m\}$
into itself whose graphs are connected, and by $d_m^{(r)}$ the number of mappings of
the set $\{1, \dots, m\}$ into itself whose graphs are connected and contain a cycle of
length $r$. Let $F_{n,N}$ denote the number of forests with $n$ labeled vertices and $N$
trees, $T = n - N$.
Explicit expressions for $d_m$ and $d_m^{(r)}$ are well known. By using the formula for
the number of rooted trees, we obtain

\[ d_m^{(r)} = \frac{m!}{(m - r)!}\, m^{m-r-1}; \]

hence,

\[ d_m = \sum_{r=1}^{m} d_m^{(r)} = (m - 1)! \sum_{k=0}^{m-1} \frac{m^k}{k!}. \]
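
For small $m$ these counts can be confirmed by enumerating all $m^m$ mappings. The sketch below is an illustration under stated assumptions (points are numbered from 0 and the function names are hypothetical): it classifies each connected mapping by the length of its unique cycle and compares the counts with the formulas for $d_m^{(r)}$ and $d_m$.

```python
import math
from itertools import product

def cycle_length_if_connected(f):
    """Cycle length of the functional graph of f if it is connected, else None."""
    m = len(f)
    v = 0
    for _ in range(m):          # m iterations from any start land on a cycle
        v = f[v]
    cycle = {v}
    w = f[v]
    while w != v:
        cycle.add(w)
        w = f[w]
    # The graph is connected iff every point eventually reaches this cycle
    for start in range(m):
        u = start
        for _ in range(m):
            u = f[u]
        if u not in cycle:
            return None
    return len(cycle)

def count_by_cycle_length(m):
    """Map r -> number of connected mappings of {0..m-1} with cycle length r."""
    counts = {}
    for f in product(range(m), repeat=m):
        r = cycle_length_if_connected(f)
        if r is not None:
            counts[r] = counts.get(r, 0) + 1
    return counts

def d_formula(m, r):
    """d_m^{(r)} = m!/(m-r)! * m^(m-r-1), evaluated in exact integer arithmetic."""
    return math.factorial(m) * m ** (m - r) // (math.factorial(m - r) * m)

if __name__ == "__main__":
    for m in range(1, 6):
        counts = count_by_cycle_length(m)
        print(m, [counts.get(r, 0) for r in range(1, m + 1)],
              [d_formula(m, r) for r in range(1, m + 1)])
```

The enumeration grows as $m^m$, so it is meant only as a sanity check for very small $m$.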


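Lemma 3.4.5. For any integer $k$, $1 \le k \le \min(n, T)$,

\[ p_{n,T}(k) = \frac{T!}{2^k k!} \left( \frac{2}{n^2} \right)^{T} \sum_{m=k}^{n} \binom{n}{m} F_{n-m,N}
   \sum_{m_1 + \cdots + m_k = m} \frac{m!\, D_{m_1} \cdots D_{m_k}}{m_1! \cdots m_k!}, \]

where

\[ D_m = \sum_{r=1}^{m} d_m^{(r)} p_r, \]

and for $k = 0$,

\[ p_{n,T}(0) = F_{n,N}\, T! \left( \frac{2}{n^2} \right)^{T}. \]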


Proof. For k = 0, the assertion is obvious. As in Section 1.7, let us denote by a”? 
the number of connected graphs with labeled vertices and one cycle of length r. 
It is clear that 


KD =a), pM@=d@, pM =aM]2, r>3. (3.4.5) 


Denote by C,,,7 the event that the graph G,,7 contains no unmarked cycles. We 
represent the event 


{v(Gn,r) =k, Gar € An,r, Cnr} 


as a union of the following disjoint events: Ina specific order, T trials give T fixed 
edges that form a graph consisting of trees and k unicyclic components, including 
a marked cycle. It follows from this description that 


Pn,T(k) = P{v(G,,7) =k, Gn,T € An,T, Cn,7} 


“(n m! 
a 81 (8 ee Saree 


m=k m+--+mg=m 


mj T 
2 1 
x > BED py . SO py n—m,T—mI! (=) aa +8 ’ 


rj=l rp=1 
where sj = 5S1(71,...,7%) is the number of 1’s among r,...,7%, and sz = 
So(r1,..., 7k) is the number of 2’s among 71,...,7%. The factor 2~*! appears 


because the probability 2n~? is replaced by n~? in s; cases. The factor 2~*? re- 
flects the fact that permuting trials in which two identical edges occur results in 
the same graph. The lemma follows from the relations (3.4.5). B 


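Theorem 3.4.6. If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, then for any
fixed $k = 0, 1, \dots$,

\[ p_{n,T}(k) = \frac{(D(a))^k \sqrt{1 - \lambda}}{2^k k!} (1 + o(1)), \]

where

\[ D(a) = \sum_{m=1}^{\infty} \frac{D_m a^m}{m!}, \qquad a = \lambda e^{-\lambda}. \]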
Proof. The proof is similar to the proof of Theorem 1.8.2. We partition the sum
from Lemma 3.4.5 into two parts. We put

\[ M = T^{1/4}. \]


It is clear that for any x in the domain of convergence of the series 


m! 
m=1 
we have 
Dy X™! ~~ © Dry e™* Diyyx™ 
1 mk k-1 m 
Dine Dat a OO ea 
m>M mi+e-tmg=m 1 ke m>M/ko Ot" 


(3.4.6) 


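Along with the function $D(x)$, let us introduce the generating function of the
number of connected mappings

\[ d(x) = \sum_{m=1}^{\infty} \frac{d_m x^m}{m!}. \]

The inequality

\[ D(x) \le d(x) \tag{3.4.7} \]

holds because

\[ D_m = \sum_{r=1}^{m} d_m^{(r)} p_r \le \sum_{r=1}^{m} d_m^{(r)} = d_m. \]

Also,

\[ d_m = (m - 1)! \sum_{k=0}^{m-1} \frac{m^k}{k!} \le (m - 1)!\, e^m, \]

which implies

\[ \sum_{m > M/k} \frac{d_m x^m}{m!} \le \sum_{m > M/k} (ex)^m. \tag{3.4.8} \]

Let

\[ \theta(x) = \sum_{n=1}^{\infty} \frac{n^{n-1} x^n}{n!}, \qquad a(x) = \sum_{n=0}^{\infty} \frac{n^n x^n}{n!}. \]

By Example 1.3.2 and (1.4.8),

\[ d(x) = \log a(x), \qquad a(x) = (1 - \theta(x))^{-1}. \]

We put $a = 2T/n$ and $x = a e^{-a}$ for $a < 1$. Then

\[ \theta(x) = a, \qquad d(x) = -\log(1 - a). \]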
Under the hypothesis of the theorem, $a = 2T/n \to \lambda$, $0 < \lambda < 1$, and for
$x = a e^{-a}$, there exists $q < 1$ such that $ex = a e^{1-a} \le q < 1$ for sufficiently large
$n$. Therefore

\[ \sum_{m > M/k} (ex)^m \le \frac{1}{1 - q}\, q^{M/k}. \tag{3.4.9} \]

Using estimates (1.8.8), (1.8.9), and (3.4.6)—(3.4.9) yields 


n m! Dm, «+» Ding 
s=gn(3) DD (thn Pm Po 
OkE : 
2 rs i hide m,!---m x! 
cT! (= a). > - nni(n = m)y-T-™)g,,, ve. Amy 
Seeeiea (n —m)!27T-™(T — m)!m,!---mg! 


CD a ca 


m>M mj+--+mg=m 


lA 


lA 


7 dAmnx™ cin 1/4 1/4 
snOQy Saw arg 
m>M/k : q 


1A 


where cj, C2 are constants. Thus, under the hypothesis of the theorem, S2 > 0. 
Ifn, T > 00, 2T/n > A, 0 < A <1, then by virtue of (1.8.7), 


T!(n —m)27-™), /T = 


Tym = x +0) 
n2t xm Be: 
= ead) 
2¢nm 


uniformly inm < M= T1/4. Therefore, for any fixed k = 1,2,..., 


n Dm, +++ D, 
! pam alt LS 
s=au(3) DY om (2) hn ee 


m<M mj+--+mgp=m 


m=k mj+---+mp=m 


By using the estimate of Sp, we obtain 


Dy, x™! © +» Dy x™* 
a a uaa by aaa er ae (1 +0(1)) +0 (1) 
m=k mj+---+mz=m 
JST — 


= Se * (Do) (1+o(1)). 
Combining the estimates of $S_1$ and $S_2$, we obtain, under the hypothesis of the
theorem,

\[ p_{n,T}(k) = P\{\nu(G_{n,T}) = k,\ G_{n,T} \in \mathcal{A}_{n,T},\ C_{n,T}\}
   = \frac{(D(x))^k \sqrt{1 - \lambda}}{2^k k!} (1 + o(1)). \tag{3.4.10} \]

Hence the assertion of Theorem 3.4.6 for $k \ge 1$ follows, since $x = a e^{-a} =
(2T/n) e^{-2T/n} \to \lambda e^{-\lambda}$ and $D(x) \to D(\lambda e^{-\lambda})$. We use (1.8.6) and the
representation from Lemma 3.4.5 and conclude that

\[ p_{n,T}(0) = \sqrt{1 - \lambda}\, (1 + o(1)). \]


Corollary 3.4.1. If $n, T \to \infty$ such that $2T/n \to \lambda$, $0 < \lambda < 1$, then the
probability $p_{n,T}$ of the event that the graph $G_{n,T}$ contains no unmarked cycles
satisfies the relation

\[ p_{n,T} = e^{D(a)/2} \sqrt{1 - \lambda}\, (1 + o(1)). \]


Proof. We denote by $p_{n,T}^{(1)}(k)$ the probability of only marked cycles in the case where
the graph has $k$ unicyclic components and all the probabilities $p_r$ are equal to 1,
$r = 1, 2, \dots$. In this case, $D(a) = d(a) = 2\Lambda = -\log(1 - \lambda)$, and Theorem
3.4.6 gives

\[ p_{n,T}^{(1)}(k) \to \frac{\Lambda^k e^{-\Lambda}}{k!}. \]

To prove the corollary, it suffices to show that in the sum

\[ p_{n,T} = \sum_{k=0}^{n} p_{n,T}(k), \tag{3.4.11} \]

one can pass to the limit under the sum. Let us show that for any $\varepsilon > 0$, there exists
$K$ such that

\[ \sum_{k=K+1}^{\infty} p_{n,T}(k) < \varepsilon. \tag{3.4.12} \]


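We choose $K$ such that

\[ \sum_{k=K+1}^{\infty} \frac{\Lambda^k e^{-\Lambda}}{k!} < \frac{\varepsilon}{2}, \]

and $n_0$ such that for $n > n_0$,

\[ \left| \sum_{k=0}^{K} p_{n,T}^{(1)}(k) - \sum_{k=0}^{K} \frac{\Lambda^k e^{-\Lambda}}{k!} \right| < \frac{\varepsilon}{2}. \]

Then, for $n > n_0$,

\[ \sum_{k=K+1}^{\infty} p_{n,T}^{(1)}(k) \le \sum_{k=K+1}^{\infty} \frac{\Lambda^k e^{-\Lambda}}{k!}
   + \sum_{k=0}^{K} \frac{\Lambda^k e^{-\Lambda}}{k!} - \sum_{k=0}^{K} p_{n,T}^{(1)}(k) < \varepsilon, \]

and therefore

\[ \sum_{k=K+1}^{\infty} p_{n,T}^{(1)}(k) < \varepsilon. \]

Since $p_{n,T}(k) \le p_{n,T}^{(1)}(k)$, estimate (3.4.12) and the validity of passing to the
limit under the sum are established. ∎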


Proof of Theorems 3.4.4 and 3.4.5. A cycle leads to the inconsistency of system
(3.4.1) if the sum of the right-hand sides of the subsystem corresponding to the
cycle is odd. Let $p_r$ be the probability that this sum is even for a cycle of length
$r$. Then $P_{n,T}(k) = p_{n,T}(k)$ for any $k = 0, 1, \dots$. Therefore Theorems 3.4.4 and
3.4.5 are direct corollaries to Theorem 3.4.6 and the fact proved above that one
can pass to the limit under the sum in (3.4.11). To prove Theorem 3.4.4, we notice
that in this case, according to Lemma 3.4.1,

\[ p_r = (1 + \Delta^r)/2, \]

where $\Delta = 1 - 2p$; therefore

\[ \begin{aligned}
\sum_{m=1}^{\infty} \frac{D_m x^m}{m!} &= \sum_{m=1}^{\infty} \sum_{r=1}^{m} \frac{d_m^{(r)} (1 + \Delta^r) x^m}{2\, m!} \\
  &= \frac{1}{2} \sum_{m=1}^{\infty} \frac{d_m x^m}{m!} + \frac{1}{2} \sum_{m=1}^{\infty} \sum_{r=1}^{m} \frac{d_m^{(r)} \Delta^r x^m}{m!}
   = \frac{1}{2} \bigl( d(x) + d(x, \Delta) \bigr),
\end{aligned} \tag{3.4.13} \]

where

\[ d(x, \Delta) = \sum_{m=1}^{\infty} \sum_{r=1}^{m} \frac{d_m^{(r)} \Delta^r x^m}{m!}. \]

For $x = a e^{-a}$, $0 < a < 1$,

\[ d(x) = -\log(1 - a), \]

and

\[ d(x, \Delta) = -\log(1 - a\Delta). \]
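Indeed,

\[ d(x, \Delta) = \sum_{m=1}^{\infty} \sum_{r=1}^{m} \frac{d_m^{(r)} \Delta^r x^m}{m!}
   = \sum_{r=1}^{\infty} \Delta^r \sum_{m=r}^{\infty} \frac{m^{m-r-1} x^m}{(m - r)!}
   = \sum_{r=1}^{\infty} \Delta^r x^r \sum_{t=0}^{\infty} \frac{(t + r)^{t-1} x^t}{t!}. \]

By using the well-known equality

\[ \sum_{t=0}^{\infty} \frac{(t + r)^{t-1} x^t}{t!} = \frac{e^{ar}}{r} \]

from [124], Chapter 2, Problem 210 (see also [126]), we obtain

\[ d(x, \Delta) = \sum_{r=1}^{\infty} \frac{\Delta^r x^r e^{ar}}{r} = \sum_{r=1}^{\infty} \frac{\Delta^r a^r}{r} = -\log(1 - a\Delta). \]

We conclude by noting that for $a = 2T/n \to \lambda$, $0 < \lambda < 1$,

\[ d(x) \to -\log(1 - \lambda), \qquad d(x, \Delta) = -\log(1 - a\Delta) \to -\log(1 - \Delta\lambda). \]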
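
The closed form for $d(x, \Delta)$ can be checked numerically from the double series. The sketch below is illustrative only: the function name, the truncation limits and the test values of $a$ and $\Delta$ are assumptions, and the terms are evaluated in log space to avoid overflow.

```python
import math

def d_series(a, delta, r_max=60, t_max=600):
    """Truncated double series sum_r delta^r x^r sum_t (t+r)^(t-1) x^t / t!
    with x = a * exp(-a), 0 < a < 1."""
    x = a * math.exp(-a)
    log_x = math.log(x)
    total = 0.0
    for r in range(1, r_max + 1):
        inner = 0.0
        for t in range(t_max + 1):
            # log of (t+r)^(t-1) x^t / t!, computed in log space
            log_term = (t - 1) * math.log(t + r) + t * log_x - math.lgamma(t + 1)
            inner += math.exp(log_term)
        total += (delta * x) ** r * inner
    return total

if __name__ == "__main__":
    a = 0.5
    for delta in (1.0, 0.4, -0.6):
        print(delta, d_series(a, delta), -math.log(1 - a * delta))
```

The case $\Delta = 1$ reproduces $d(x) = -\log(1 - a)$, since then $d(x, \Delta)$ is the full generating function of connected mappings.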


Let us turn to the proof of Theorem 3.4.5. If $m/T \to 0$, then for any fixed $k$,
all the cycles are marked with probability tending to 1. Therefore

\[ p_{n,T}(k) = p_{n,T}^{(1)}(k) + o(1) = \frac{\Lambda^k e^{-\Lambda}}{k!} (1 + o(1)). \]

In the case where $m/T \to 1$, we have $p_r \to 0$ for odd $r$ and $p_r \to 1$ for even $r$
by Lemma 3.4.3. Therefore, in this case,

\[ D_m \to D_m^{(2)} = \sum_{1 \le r \le m/2} d_m^{(2r)}, \qquad
   D(a) = \sum_{m=1}^{\infty} \frac{D_m x^m}{m!} \to D^{(2)}(a) = \sum_{m=1}^{\infty} \frac{D_m^{(2)} x^m}{m!}. \]

It is not difficult to see that

\[ D^{(2)}(a) = \sum_{r=1}^{\infty} \frac{a^{2r}}{2r} = -\tfrac{1}{2} \log(1 - a^2). \]

In the case where $m/T \to p$, $0 < p < 1$, by Lemma 3.4.4,

\[ p_r \to (1 + \Delta^r)/2, \]

and, as in (3.4.13),

\[ D(x) \to D(a) = \bigl( d(a) + d(a, \Delta) \bigr)/2 = -\tfrac{1}{2} \log\bigl( (1 - \lambda)(1 - \Delta\lambda) \bigr). \]


3.5. Hypercycles and consistency of systems of 
random equations 


In Section 3.2, we studied the rank of random matrices and found, in particular, that
if the elements of a $T \times n$ matrix $A = \|a_{tj}\|$ are independent identically distributed
random variables taking the values 0 and 1 with equal probabilities, then the rank
$r(A)$ of the matrix $A$ has a threshold property: If $T/n \to \alpha$ and $\alpha < 1$, then
$P\{r(A) = T\} \to 1$, and if $T/n \to \alpha$ and $\alpha > 1$, then $P\{r(A) = n\} \to 1$.
In other words, the maximum number of independent critical sets $s(A)$ tends in
probability to zero in the former case and to infinity in the latter case. A similar
property apparently holds for the sparse matrices considered in Section 3.3: We
proved only that if $\alpha < 1$, then $s(A)$ has in the limit a Poisson distribution, and
$\mathsf{E} s(A) \to \infty$ for $\alpha > 1$.

In Section 3.4, we considered systems with at most two unknowns in each
equation. It was shown that if $T/n \to \alpha$, $0 < \alpha < 1/2$, then the distribution of the
maximal number of independent critical sets or independent cycles in the corresponding
graph approaches the Poisson distribution with parameter $\Lambda = -\tfrac{1}{2} \log(1 - 2\alpha)$.
As follows from Theorem 2.1.6, if $\alpha > 1/2$, then $s(A)$ tends in probability to
infinity.

The case of a matrix with independent and identically distributed random elements
taking the values 0 and 1 with probabilities 1/2 and the case of a matrix with
at most two elements in each row studied in Section 3.4 can be considered as the
extreme cases in terms of the behavior of the rank and the maximum number of
independent critical sets. In these cases, the threshold effect appears at the points
$T/n = 1$ and $T/n = 1/2$, respectively.

In this section, we consider an intermediate case and obtain a weaker form of
the threshold effect. We consider the system of random linear equations in GF(2):

\[ x_{i_1(t)} + \cdots + x_{i_r(t)} = b_t, \qquad t = 1, \dots, T, \tag{3.5.1} \]

where $i_1(t), \dots, i_r(t)$, $t = 1, \dots, T$, are independent identically distributed random
variables taking the values $1, \dots, n$ with equal probabilities, and the independent
random variables $b_1, \dots, b_T$ do not depend on the left-hand side of the
system and take the values 0 and 1 with equal probabilities. If $r = 2$, we obtain
the system considered in Section 3.4.

In Section 3.1, we introduced the notions of critical sets for a matrix and hypercycles
for the hypergraph corresponding to a matrix. Denote by $A_{r,n,T}$ the matrix
of system (3.5.1) and by $G_{r,n,T}$ the hypergraph with $n$ vertices and $T$ hyperedges
$e_1, \ldots, e_T$ that corresponds to this matrix. Thus we consider a random hypergraph
$G_{r,n,T}$, whose matrix $A = A_{r,n,T} = \|a_{tj}\|$ has the following structure. The
elements $a_{tj}$, $t = 1, \ldots, T$, $j = 1, \ldots, n$, of the matrix are random variables, and the
rows of the matrix are independent. There are $r$ ones allocated to each row: Each
1, independently of the others, is placed in each of the $n$ positions with probability $1/n$,
and $a_{tj}$ equals 1 if there is an odd number of 1's in position $j$ of row $t$. Therefore,
there are no more than $r$ ones in each row.

For such regular hypergraphs, the following threshold property holds: If $n, T \to \infty$
such that $T/n \to \alpha$, then an abrupt change in the behavior of the rank of the
matrix $A_{r,n,T}$ occurs while the parameter $\alpha$ passes the critical value $\alpha_r$. This
property can be expressed in terms of the total number of hypercycles in $G_{r,n,T}$.
Let $s(A_{r,n,T})$ be the maximum number of independent critical sets of $A_{r,n,T}$ or
independent hypercycles of the hypergraph $G_{r,n,T}$. Then

\[ S(A_{r,n,T}) = 2^{s(A_{r,n,T})} - 1 \]

is the total number of critical sets or hypercycles.
In this section, we prove that the following threshold property is true for
$S(A_{r,n,T})$.
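
The construction of the matrix and the two counts can be illustrated by a minimal sketch. It assumes the standard linear-algebra identification $s(A) = T - r(A)$ (critical sets are the nonempty sets of rows whose sum over GF(2) is zero); the function names and the bitmask representation of rows are illustrative choices and not part of the text.

```python
import random

def random_row(r, n, rng):
    """One row as a bitmask: r ones dropped uniformly into n positions, reduced mod 2."""
    row = 0
    for _ in range(r):
        row ^= 1 << rng.randrange(n)   # a second hit on the same position cancels the first
    return row

def gf2_rank(rows):
    """Rank over GF(2) of rows given as integer bitmasks (XOR-basis elimination)."""
    basis = {}                          # leading bit position -> basis vector
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
    return len(basis)

def hypercycle_counts(rows):
    """Return (s, S) with s = T - rank (assumed identity) and S = 2**s - 1."""
    s = len(rows) - gf2_rank(rows)
    return s, 2 ** s - 1

if __name__ == "__main__":
    rng = random.Random(0)
    r, n, T = 3, 40, 30                 # illustrative sizes only
    rows = [random_row(r, n, rng) for _ in range(T)]
    print(hypercycle_counts(rows))
```

Repeating the experiment for growing $n$ with $T/n$ fixed on either side of a threshold gives an empirical picture of the behavior described in this section.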


Theorem 3.5.1. Let $r \ge 3$ be fixed, $T, n \to \infty$ such that $T/n \to \alpha$. Then there
exists a constant $\alpha_r$ such that $ES(A_{r,n,T}) \to 0$ for $\alpha < \alpha_r$ and $ES(A_{r,n,T}) \to \infty$
for $\alpha > \alpha_r$.

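The constant $\alpha_r$ is the first component of the vector that is the unique solution
of the system of equations

\[
\begin{aligned}
e^{-x}\cosh\lambda \left(\frac{\alpha r}{\alpha r - x}\right)^{\alpha} &= 1, \\
\frac{x}{\lambda}\left(\frac{\alpha r - x}{x}\right)^{1/r} &= 1, \\
\lambda\tanh\lambda &= x,
\end{aligned}
\tag{3.5.2}
\]

with respect to the variables $\alpha$, $x$, and $\lambda$.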


The numerical solution of the system of equations gives us the following values
of the critical constants:

\[ \alpha_3 = 0.8894\ldots, \quad \alpha_4 = 0.9671\ldots, \quad \alpha_5 = 0.9891\ldots, \]

\[ \alpha_6 = 0.9969\ldots, \quad \alpha_7 = 0.9986\ldots, \quad \alpha_8 = 0.9995\ldots. \]
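
The following sketch shows one way such values could be computed. It assumes that system (3.5.2) reads $e^{-x}\cosh\lambda\,(\alpha r/(\alpha r - x))^{\alpha} = 1$, $(x/\lambda)((\alpha r - x)/x)^{1/r} = 1$, $\lambda\tanh\lambda = x$; the last two equations are used to express $x$ and $\alpha$ through $\lambda$, and the first is then solved by bisection. The bracket $[0.5, r]$ and all function names are assumptions of this sketch, checked here only for small $r$.

```python
import math

def alpha_on_curve(lam, r):
    """x and alpha determined by the second and third equations for a given lambda."""
    x = lam * math.tanh(lam)
    alpha = (x + lam ** r / x ** (r - 1)) / r   # from (x/lam) * ((alpha*r - x)/x)**(1/r) = 1
    return x, alpha

def first_equation_log(lam, r):
    """Logarithm of the left-hand side of the first equation; zero at the solution."""
    x, alpha = alpha_on_curve(lam, r)
    return -x + math.log(math.cosh(lam)) + alpha * math.log(alpha * r / (alpha * r - x))

def critical_alpha(r, lo=0.5, hi=None, iters=200):
    """Bisection in lambda on [lo, hi]; assumes a single sign change in the bracket."""
    hi = float(r) if hi is None else hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if first_equation_log(mid, r) < 0:
            lo = mid
        else:
            hi = mid
    return alpha_on_curve(0.5 * (lo + hi), r)[1]

if __name__ == "__main__":
    for r in (3, 4):
        print(r, round(critical_alpha(r), 4))
```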
Expanding the solution of the system into powers of $e^{-r}$ yields


hs et et fr r 1 
ee ra ar eee 
which gives values close to the exact ones for r > 4. 

Let us give some auxiliary results that will be needed for the proof of
Theorem 3.5.1.

The total number of hypercycles $S(A_{r,n,T})$ in the hypergraph $G_{r,n,T}$ with the
matrix $A_{r,n,T}$ can be represented as a sum of indicators. Let $\xi_{t_1 \ldots t_m} = 1$ if the
hypercycle $C = \{e_{t_1}, \ldots, e_{t_m}\}$ occurs in $G_{r,n,T}$, and $\xi_{t_1 \ldots t_m} = 0$ otherwise. It is
clear that $P\{\xi_{t_1 \ldots t_m} = 1\}$ does not depend on the indices $t_1, \ldots, t_m$. Indeed, from
the definition of the random hypergraph $G_{r,n,T}$, the indicator $\xi_{t_1 \ldots t_m} = 1$ if and
only if there is an even number of 1's in each column of the submatrix consisting
of the rows with indices $t_1, \ldots, t_m$. The numbers of 1's in the $n$ columns of any $m$ rows,
before these numbers are reduced modulo 2, have the multinomial distribution
with $rm$ trials and $n$ equiprobable outcomes.

Denote by $\eta_1(s,n), \ldots, \eta_n(s,n)$ the contents of the cells in the equiprobable
scheme of allocating $s$ particles into $n$ cells. In this notation, the numbers
of 1's in the columns of any $m$ rows, before those numbers have been reduced
modulo 2, have a distribution that coincides with the distribution of the variables
$\eta_1(rm,n), \ldots, \eta_n(rm,n)$. Therefore

\[ P\{\xi_{t_1 \ldots t_m} = 1\} = P\{\eta_1(rm,n) \in E, \ldots, \eta_n(rm,n) \in E\}, \]

where $E$ is the set of even numbers, and the average number of hypercycles in
$G_{r,n,T}$ can be written in the following form:

\[ ES(A_{r,n,T}) = \sum_{m=1}^{T} \binom{T}{m} P_E(rm,n), \tag{3.5.3} \]

where
\[ P_E(rm,n) = P\{\eta_1(rm,n) \in E, \ldots, \eta_n(rm,n) \in E\}. \]
Thus, to estimate $ES(A_{r,n,T})$, we need to know the asymptotic behavior of
$P_E(rm,n)$.
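
For small parameters, the sum (3.5.3) can be evaluated exactly. The sketch below relies on an identity that is not stated in the text and is used here as an assumption of the example: averaging $\prod_j (1 + (-1)^{\eta_j})/2$ over the allocation gives $P_E(s,n) = 2^{-n}\sum_{k=0}^{n}\binom{n}{k}(1 - 2k/n)^s$. A brute-force enumeration is included as a cross-check; all function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

def p_even(s, n):
    """P{all n cell counts are even} when s particles fall uniformly into n cells."""
    total = sum(comb(n, k) * Fraction(n - 2 * k, n) ** s for k in range(n + 1))
    return total / 2 ** n

def p_even_bruteforce(s, n):
    """The same probability by enumerating all n**s allocations (tiny cases only)."""
    good = 0
    for cells in product(range(n), repeat=s):
        counts = [0] * n
        for c in cells:
            counts[c] += 1
        good += all(c % 2 == 0 for c in counts)
    return Fraction(good, n ** s)

def expected_hypercycles(r, n, T):
    """E S(A_{r,n,T}) evaluated from the sum (3.5.3)."""
    return sum(comb(T, m) * p_even(r * m, n) for m in range(1, T + 1))

if __name__ == "__main__":
    print(p_even(4, 3), p_even_bruteforce(4, 3))
    print(float(expected_hypercycles(3, 20, 10)))
```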


We consider a more general case and obtain the asymptotic behavior of the
probabilities

\[ P_R(s,n) = P\{\eta_1(s,n) \in R, \ldots, \eta_n(s,n) \in R\}, \]

where $R$ is a subset of the set of all nonnegative integers.

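The joint distribution of the random variables $\eta_1(s,n), \ldots, \eta_n(s,n)$ can be
expressed as a conditional distribution of independent random variables $\xi_1, \ldots, \xi_n$,
identically distributed by the Poisson law with an arbitrary parameter $\lambda$, in the
following way (see, e.g., [90]). For any nonnegative integers $s_1, \ldots, s_n$ such that
$s_1 + \cdots + s_n = s$,

\[ P\{\eta_1(s,n) = s_1, \ldots, \eta_n(s,n) = s_n\}
   = P\{\xi_1 = s_1, \ldots, \xi_n = s_n \mid \xi_1 + \cdots + \xi_n = s\}. \]

Therefore

\[
\begin{aligned}
P_R(s,n) &= P\{\eta_1(s,n) \in R, \ldots, \eta_n(s,n) \in R\} \\
 &= \frac{P\{\xi_1 \in R, \ldots, \xi_n \in R,\ \xi_1 + \cdots + \xi_n = s\}}{P\{\xi_1 + \cdots + \xi_n = s\}} \\
 &= (P\{\xi_1 \in R\})^n \, \frac{P\{\xi_1 + \cdots + \xi_n = s \mid \xi_1 \in R, \ldots, \xi_n \in R\}}{P\{\xi_1 + \cdots + \xi_n = s\}}.
\end{aligned}
\]

We now introduce independent identically distributed random variables
$\xi_1^{(R)}, \ldots, \xi_n^{(R)}$ with the distribution

\[ P\{\xi_1^{(R)} = k\} = P\{\xi_1 = k \mid \xi_1 \in R\}, \qquad k = 0, 1, \ldots. \]

It is not difficult to see that

\[ P\{\xi_1 + \cdots + \xi_n = s \mid \xi_1 \in R, \ldots, \xi_n \in R\} = P\{\xi_1^{(R)} + \cdots + \xi_n^{(R)} = s\}, \]

and therefore

\[ P_R(s,n) = (P\{\xi_1 \in R\})^n \, \frac{P\{\xi_1^{(R)} + \cdots + \xi_n^{(R)} = s\}}{P\{\xi_1 + \cdots + \xi_n = s\}}. \tag{3.5.4} \]

Let $x = s/n$ and choose the parameter $\lambda$ of the Poisson distribution in such a way
that

\[ x = E\xi_1^{(R)} = \sum_{k \in R} k \, P\{\xi_1 = k \mid \xi_1 \in R\}. \]

Let $d$ be the maximum span of the lattice on which the set $R$ is situated and denote
the lattice by $\Gamma_R$.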


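Theorem 3.5.2. If $s, n \to \infty$ such that $n \in \Gamma_R$, then in any interval of the form
$0 < x_0 \le x \le x_1 < \infty$,

\[ P_R(s,n) = (P\{\xi_1 \in R\})^n \, \frac{x^s e^{\lambda n}}{\lambda^s e^{s}} \, \frac{d\sqrt{x}}{\sigma} \, (1 + o(1)) \]

uniformly in $x = s/n$, where the parameter $\lambda$ of the Poisson distribution of the
random variable $\xi_1$ is the root of the equation $x = E\xi_1^{(R)}$, and $\sigma^2 = D\xi_1^{(R)}$ (the
variance).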
Proof. The local limit theorem holds for the sum $\xi_1^{(R)} + \cdots + \xi_n^{(R)}$. Following
the classical proof of the local limit theorem of Gnedenko [49], we prove that if
$s, n \to \infty$ such that $n \in \Gamma_R$, then

\[ P\{\xi_1^{(R)} + \cdots + \xi_n^{(R)} = s\} = \frac{d}{\sigma\sqrt{2\pi n}} \, (1 + o(1)) \]

uniformly in $x = s/n$ in any interval of the form $0 < x_0 \le x \le x_1 < \infty$, where
$\sigma^2 = D\xi_1^{(R)}$, and $d$ is the span of the lattice $\Gamma_R$.

When we substitute the expression into (3.5.4) and take into account that the
sum $\xi_1 + \cdots + \xi_n$ is distributed by the Poisson law with parameter $\lambda n$, we obtain
the assertion of the theorem. □


Note that (3.5.4) implies the estimate

\[ P_R(s,n) \le (P\{\xi_1 \in R\})^n \, \frac{s! \, e^{\lambda n}}{(\lambda n)^s}, \]

where $P_R(s,n)$ does not depend on $\lambda$, and on the right-hand side any positive value
can be assigned to this parameter. Let $E = \{0, 2, \ldots\}$. In this case,

\[ P\{\xi_1 \in E\} = e^{-\lambda}\cosh\lambda, \]

and the estimate takes the form

\[ P_E(s,n) \le (\cosh\lambda)^n \, \frac{s!}{(\lambda n)^s}, \tag{3.5.5} \]

where $\lambda > 0$ can be chosen arbitrarily.
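
A small numerical check of the two facts just used can be sketched as follows. It assumes that estimate (3.5.5) has the form $P_E(s,n) \le (\cosh\lambda)^n s!/(\lambda n)^s$ and, for the exact left-hand side, uses the identity $P_E(s,n) = 2^{-n}\sum_k \binom{n}{k}(1 - 2k/n)^s$, which is an assumption of this example rather than a formula from the text. Function names are illustrative.

```python
import math

def poisson_even_mass(lam, terms=200):
    """P{xi in E} for a Poisson(lam) variable, summed term by term."""
    term, total = math.exp(-lam), 0.0
    for k in range(terms):
        if k % 2 == 0:
            total += term
        term *= lam / (k + 1)
    return total

def p_even(s, n):
    """Exact P_E(s, n) in floating point (assumed character-sum identity)."""
    return sum(math.comb(n, k) * (1 - 2 * k / n) ** s for k in range(n + 1)) / 2 ** n

def bound_355(s, n, lam):
    """Right-hand side of (3.5.5), computed through logarithms for stability."""
    log_b = n * math.log(math.cosh(lam)) + math.lgamma(s + 1) - s * math.log(lam * n)
    return math.exp(log_b)

if __name__ == "__main__":
    print(poisson_even_mass(1.3), math.exp(-1.3) * math.cosh(1.3))
    for lam in (0.5, 1.0, 2.0):
        print(lam, p_even(6, 4), bound_355(6, 4, lam))
```

Since the bound holds for every $\lambda > 0$, the proofs below are free to pick the value of $\lambda$ that suits each range of $m$.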
We now estimate

\[ ES(A_{r,n,T}) = \sum_{m=1}^{T} \binom{T}{m} P_E(rm,n). \]

Lemma 3.5.1. If $r \ge 3$ is fixed, and $T, n \to \infty$ such that $T/n \to \alpha$, then for any
$\varepsilon > 0$, there exists $\delta > 0$ such that

\[ \sum_{1 \le m \le \delta T} \binom{T}{m} P_E(rm,n) < \varepsilon. \]

Proof. First we point out that

\[ P\{\xi_1^{(E)} = 2k\} = P\{\xi_1 = 2k \mid \xi_1 \in E\} = \frac{\lambda^{2k}}{(2k)! \cosh\lambda}, \qquad k = 0, 1, 2, \ldots, \]

\[ E\xi_1^{(E)} = \lambda\tanh\lambda. \]

Put $x = rm/n$ and choose the parameter $\lambda$ of the Poisson distribution in such a
way that $x = \lambda\tanh\lambda$. From (3.5.5), it follows that

\[ P_E(rm,n) \le (\cosh\lambda)^n \, \frac{(rm)!}{\lambda^{rm} n^{rm}}. \]

Since the value of $x$ becomes small for sufficiently small $\delta$, we can assume that
$\lambda < 1$ in the domain of summation. For such $\lambda$,

\[ \lambda^2/4 \le x = \lambda\tanh\lambda \le \lambda^2, \]

and therefore

\[ \cosh\lambda \le e^{\lambda^2} \le e^{4x}. \]

We now estimate the sum. It is easy to see that 


T T™ ! 
> ( ) Petrm,m < > —(coshay" om 
l<m<é6T ut l<m<6T hia aman 
< ys Te axn (Fm) 
= Arm yrm 
l<m<6T 


ve 


1<m<6T 


l<m<dbT 


2 m 
= Ds (()" ear) . 
n 


l<m<é6T 


r\" 1 (r-1) 
(ZV om (By 
z (Q*eey ey 


Since $T/n$ tends to a constant, the last sum can be made arbitrarily small by
choosing a sufficiently small $\delta$.


Lemma 3.5.2. If $r$ is fixed, and $T, n \to \infty$ such that $T/n \to \alpha$, $0 < \alpha < 1$,
then for any $\varepsilon > 0$, there exists $\delta > 0$ such that

\[ \sum_{(1-\delta)T \le m \le T} \binom{T}{m} P_E(rm,n) < \varepsilon. \]


Proof. Put $\lambda = rm/n$ and let an integer $m_0$ be chosen such that $m_0/T < \delta$. With
such a choice of $\lambda$, by (3.5.5),

\[ \sum_{m=T-m_0}^{T} \binom{T}{m} P_E(rm,n) \le \sum_{m=T-m_0}^{T} \binom{T}{m} (\cosh\lambda)^n \, \frac{(rm)!}{(rm)^{rm}}. \]

Since in the domain of summation, $\lambda$ is greater than some positive constant, there
exists $q < 1$ such that

\[ e^{-\lambda}\cosh\lambda = (1 + e^{-2\lambda})/2 \le q. \]

By using the inequality

\[ (rm)! \le c \, (rm)^{rm} e^{-rm} (rT)^{1/2}, \]

where $c$ is a constant, we obtain

\[
\begin{aligned}
\sum_{m=T-m_0}^{T} \binom{T}{m} P_E(rm,n)
 &\le c \, (rT)^{1/2} \sum_{m=T-m_0}^{T} \binom{T}{m} (e^{-\lambda}\cosh\lambda)^n \\
 &\le c \, (rT)^{1/2} q^n \sum_{m=T-m_0}^{T} \binom{T}{m} \frac{q^m (1-q)^{T-m}}{q^T (1-q)^{m_0}} \\
 &\le c \, (rT)^{1/2} \left( \frac{q}{(1-q)^{m_0/(n-T)}} \right)^{n-T}.
\end{aligned}
\]

Since $q, \alpha < 1$, the value $m_0/(n-T)$ can be made arbitrarily small by choosing
a sufficiently small $\delta$, and therefore the value $q/(1-q)^{m_0/(n-T)}$ can be made
smaller than some $Q < 1$. Thus, for a sufficiently small $\delta$, the right-hand side
tends to zero under the conditions of the lemma. □


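Proof of Theorem 3.5.1. We now estimate the middle part of the sum. As $T/n \to \alpha$
and $\delta \le m/T \le 1 - \delta$, the values $x = rm/n$ lie in an interval of the form
$0 < x_0 \le x \le x_1 < \infty$. When we apply Theorem 3.5.2, we obtain for even $rm$,

\[ P_E(rm,n) = (P\{\xi_1 \in E\})^n \left(\frac{x}{\lambda e}\right)^{rm} e^{\lambda n} \, \frac{2\sqrt{x}}{\sigma} \, (1 + o(1)) \]

uniformly in $x$, $x_0 \le x \le x_1$, where $x = E\xi_1^{(E)} = \lambda\tanh\lambda$ and
$\sigma^2 = D\xi_1^{(E)} = \lambda^2 + x - x^2$.
From $P\{\xi_1 \in E\} = e^{-\lambda}\cosh\lambda$, we obtain the final estimate: As $T, n \to \infty$,
$T/n \to \alpha$,

\[ P_E(rm,n) = (\cosh\lambda)^n \left(\frac{x}{\lambda e}\right)^{rm} \frac{2\sqrt{x}}{\sigma} \, (1 + o(1)) \tag{3.5.6} \]

uniformly in $m$, $\delta \le m/T \le 1 - \delta$.
Setting $p = m/T$, $q = 1 - p$, and using the normal approximation to the
binomial distribution shows that, as $T \to \infty$,

\[ \binom{T}{m} = \binom{T}{m} p^m q^{T-m} \, (p^m q^{T-m})^{-1} = (p^m q^{T-m})^{-1} \, \frac{1}{\sqrt{2\pi T p q}} \, (1 + o(1)) \]

uniformly in $m$, $\delta \le m/T \le 1 - \delta$.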
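Let $\alpha = T/n$ and write $p = m/T$ in terms of $x = rm/n$ and $\alpha$. Then

\[ p = \frac{m}{T} = \frac{x}{\alpha r}, \qquad q = 1 - \frac{m}{T} = \frac{\alpha r - x}{\alpha r}, \]

and the estimate of $\binom{T}{m}$ takes the following form. As $T \to \infty$, $\delta \le m/T \le 1 - \delta$,

\[ \binom{T}{m} = \frac{\alpha r}{\sqrt{2\pi x(\alpha r - x)\alpha n}} \left( \frac{(\alpha r)^{\alpha r}}{x^x (\alpha r - x)^{\alpha r - x}} \right)^{n/r} (1 + o(1)) \tag{3.5.7} \]

uniformly in $m$.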
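We combine the estimates (3.5.6) and (3.5.7) and obtain

\[ \binom{T}{m} P_E(rm,n) = (f(\alpha,x))^n \, \frac{2\alpha r\sqrt{x}}{\sigma\sqrt{2\pi x(\alpha r - x)\alpha n}} \, (1 + o(1)), \]

where

\[ f(\alpha,x) = \cosh\lambda \left(\frac{x}{\lambda e}\right)^{x} \left(\frac{\alpha r - x}{x}\right)^{x/r} \left(\frac{\alpha r}{\alpha r - x}\right)^{\alpha}, \qquad x = \lambda\tanh\lambda. \]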


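The function $f(\alpha,x)$ increases as $\alpha$ increases,

\[ f'_x(\alpha,x) = f(\alpha,x) \log\left( \frac{x}{\lambda} \left(\frac{\alpha r - x}{x}\right)^{1/r} \right) \to -\infty, \qquad x \to 0, \]

and the derivative $f'_x(\alpha,x)$ has no more than two zeros. Therefore the system of
equations

\[
\begin{aligned}
f(\alpha,x) &= 1, \\
f'_x(\alpha,x) &= 0, \\
\lambda\tanh\lambda &= x
\end{aligned}
\tag{3.5.8}
\]

has the unique solution $(\alpha_r, x_r, \lambda_r)$; at this point, the function $f(\alpha_r, x)$ as a function
of $x$ attains its maximum, which is equal to 1. Therefore, for all $x$, $0 < x < \alpha_r r$,

\[ f(\alpha_r, x) \le f(\alpha_r, x_r) = 1. \]

In addition,

\[ f(\alpha, x) < f(\alpha_r, x) \le 1, \qquad \alpha < \alpha_r, \]
\[ f(\alpha, x_r) > f(\alpha_r, x_r) = 1, \qquad \alpha > \alpha_r. \]

This implies that the middle part of the sum tends to zero for $\alpha < \alpha_r$ and tends to
infinity for $\alpha > \alpha_r$.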
If we consider the estimates for the tails of the sum in Lemmas 3.5.1 and 3.5.2,
we obtain the assertion of Theorem 3.5.1 because system (3.5.8) can be easily
transformed to the form mentioned in the statement of the theorem. □
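
The transformation can be sketched as follows, assuming that $f(\alpha,x) = \cosh\lambda\,(x/(\lambda e))^{x}((\alpha r - x)/x)^{x/r}(\alpha r/(\alpha r - x))^{\alpha}$ and that $f'_x = f\log\bigl((x/\lambda)((\alpha r - x)/x)^{1/r}\bigr)$, with $x = \lambda\tanh\lambda$ throughout. Since $f > 0$, the condition $f'_x = 0$ removes the logarithm, and substituting the result into $f$ cancels the powers of $x/\lambda$:

```latex
\begin{align*}
f'_x(\alpha,x) = 0
  &\iff \frac{x}{\lambda}\Bigl(\frac{\alpha r - x}{x}\Bigr)^{1/r} = 1
   \iff \Bigl(\frac{\alpha r - x}{x}\Bigr)^{x/r} = \Bigl(\frac{\lambda}{x}\Bigr)^{x}, \\
f(\alpha,x)
  &= \cosh\lambda\,\Bigl(\frac{x}{\lambda e}\Bigr)^{x}\Bigl(\frac{\lambda}{x}\Bigr)^{x}
     \Bigl(\frac{\alpha r}{\alpha r - x}\Bigr)^{\alpha}
   = e^{-x}\cosh\lambda\,\Bigl(\frac{\alpha r}{\alpha r - x}\Bigr)^{\alpha}.
\end{align*}
```

Under these assumptions, the equation $f(\alpha,x) = 1$ becomes $e^{-x}\cosh\lambda\,(\alpha r/(\alpha r - x))^{\alpha} = 1$, which together with the other two relations is a system in $\alpha$, $x$, and $\lambda$ of the kind referred to in Theorem 3.5.1.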


It would be interesting to find the limit distribution of the number of hypercycles.
Up to now, no one has succeeded even in proving that $S(A_{r,n,T})$ tends in probability
to infinity as $T, n \to \infty$, $T/n \to \alpha > \alpha_r$.


3.6. Reconstructing the true solution 


We consider the system of equations in GF(2):

\[ x_{i(t)} + x_{j(t)} = b_t, \qquad t = 1, \ldots, T, \tag{3.6.1} \]

where the pairs $(i(t), j(t))$, $t = 1, \ldots, T$, are independent identically distributed
two-dimensional random vectors that take values $(i, j)$, $i < j$, $i, j = 1, \ldots, n$,
with equal probabilities $\binom{n}{2}^{-1}$.

In Section 3.1, we interpreted a system similar to (3.6.1) as the result of $T$ trials
performed with the aim of classifying $n$ objects by random pairwise comparisons,
and we set $b_t = 0$ if the comparison of $x_{i(t)}$ and $x_{j(t)}$ showed that these objects
were from the same class, and $b_t = 1$ otherwise, for $t = 1, \ldots, T$. If the comparisons
are not perfectly reliable, then the result of a comparison may deviate from
the true value. Suppose that $X^* = (x_1^*, \ldots, x_n^*)$ is the vector of true values of the
unknowns, and the column-vector $B^* = (b_1^*, \ldots, b_T^*)$ is obtained by substituting
$X^*$ into the left-hand side of system (3.6.1):

\[ AX^* = B^*, \tag{3.6.2} \]

where $A$ is the matrix of system (3.6.1).
If the measurements are not precise, then it is natural to suppose that

\[ b_t = b_t^* + \varepsilon_t, \qquad t = 1, \ldots, T, \]

where $\varepsilon_1, \ldots, \varepsilon_T$ are independent identically distributed random variables that
do not depend on $A$ and take the values 0 and 1. These random variables can be
interpreted as errors. Let

\[ P\{\varepsilon_t = 1\} = \frac{1 - \Delta}{2}, \qquad P\{\varepsilon_t = 0\} = \frac{1 + \Delta}{2}, \]

where $\Delta$ is called the excess.
The problem is to estimate or reconstruct the vector $X^* = (x_1^*, \ldots, x_n^*)$ on the
basis of the matrix $A$ and the right-hand side $B = (b_1, \ldots, b_T)$ of system (3.6.1).
In a similar situation over the field of real numbers, an estimate of the true
solution of a system of linear equations with perturbed right-hand sides can be
found by the least-squares method. Under some conditions on the matrix and the
errors in the right-hand sides, the least-squares method provides an estimate that
converges to the true solution as the number of equations tends to infinity. In
contrast to the field of real numbers, in GF(2) a good estimate $\hat X = (\hat x_1, \ldots, \hat x_n)$
coincides with the true solution $X^* = (x_1^*, \ldots, x_n^*)$ with probability tending to 1
as $T \to \infty$.

As usual, we associate the graph $\Gamma_{n,T}$ with the left-hand side of system (3.6.1).
The graph $\Gamma_{n,T}$ has $n$ labeled vertices corresponding to the unknowns $x_1, \ldots, x_n$
and $T$ edges $e_t = (i(t), j(t))$, $t = 1, \ldots, T$. The edges $e_1, \ldots, e_T$ are independent
and assume the $n(n-1)/2$ possible values with equal probability. Therefore, the
graph $\Gamma_{n,T}$ may have multiple edges.

It is clear that along with the vector $X^*$, the vector $\bar X^* = (\bar x_1^*, \ldots, \bar x_n^*)$ with
elements $\bar x_i^* = x_i^* + 1$, $i = 1, \ldots, n$, satisfies the system (3.6.2). The pair $X^*$,
$\bar X^*$ is uniquely determined by the system (3.6.2) if the graph $\Gamma_{n,T}$ is connected, in
other words, if the system (3.6.2) contains all the unknowns and is not decomposed
into subsystems with disjoint sets of unknowns.

Denote by $p_{n,T}$ the probability that the graph $\Gamma_{n,T}$ is connected. It follows from
Theorem 2.3.8 that if $n, T \to \infty$ such that $T = \frac{1}{2} n\log n + an + o(n)$, where $a$ is a
constant, then

\[ p_{n,T} \to e^{-e^{-2a}}. \]

Thus, if $n, T \to \infty$ and the pair $X^*$, $\bar X^*$ is determined by the system (3.6.2) with
probability tending to 1, then

\[ \frac{T}{n\log n} = \frac{1}{2} + \frac{\omega_n}{\log n}, \]

where $\omega_n \to \infty$.

In this section, we present three algorithms for reconstructing the true solution
of system (3.6.1) with perturbed right-hand sides. We first describe the reconstruction
method that can be called the voting algorithm. This algorithm consists of
correcting the right-hand sides $b_1, \ldots, b_T$ of the system (3.6.1) by the majority
rule. Let the system (3.6.1) contain the subsystem with $m_{ij}$, $i < j$, equations:

\[
\begin{aligned}
x_i + x_j &= a_{ij}^{(1)}, \\
&\;\;\vdots \\
x_i + x_j &= a_{ij}^{(m_{ij})}.
\end{aligned}
\tag{3.6.4}
\]

The true value of $a_{ij}^{(1)}, \ldots, a_{ij}^{(m_{ij})}$ equals $a_{ij}^* = x_i^* + x_j^*$. We set $\hat a_{ij} = 1$ if

\[ a_{ij}^{(1)} + \cdots + a_{ij}^{(m_{ij})} > m_{ij}/2, \]

and $\hat a_{ij} = 0$ otherwise.
Under some conditions, system (3.6.1) is indecomposable and $\hat a_{ij} = a_{ij}^*$ for all
$i, j = 1, \ldots, n$; thus the true solution is reconstructed.
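
The voting algorithm can be sketched in a few lines. The sketch assumes the error model $P\{\varepsilon_t = 1\} = (1 - \Delta)/2$ and represents each equation as a triple `(i, j, b)` with $i < j$; the helper names, the breadth-first propagation from $x_0 = 0$, and the choice of returning `None` for a disconnected graph are illustrative decisions, not part of the text.

```python
import random
from collections import defaultdict, deque

def make_system(n, T, delta, rng):
    """Random system (3.6.1): returns the true vector and equations (i, j, b), i < j."""
    x_true = [rng.randrange(2) for _ in range(n)]
    eqs = []
    for _ in range(T):
        i, j = sorted(rng.sample(range(n), 2))
        err = 1 if rng.random() < (1 - delta) / 2 else 0   # assumed error law
        eqs.append((i, j, (x_true[i] + x_true[j] + err) % 2))
    return x_true, eqs

def majority_vote(eqs):
    """Corrected right-hand side per observed pair: 1 iff more than half the votes are 1."""
    votes = defaultdict(list)
    for i, j, b in eqs:
        votes[(i, j)].append(b)
    return {pair: int(2 * sum(bs) > len(bs)) for pair, bs in votes.items()}

def solve_from_pairs(n, a_hat):
    """Propagate x_j = x_i + a_ij from x_0 = 0; None if the graph is disconnected."""
    adj = defaultdict(list)
    for (i, j), a in a_hat.items():
        adj[i].append((j, a))
        adj[j].append((i, a))
    x = [None] * n
    x[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v, a in adj[u]:
            if x[v] is None:
                x[v] = x[u] ^ a
                queue.append(v)
    return None if None in x else x

if __name__ == "__main__":
    rng = random.Random(1)
    x_true, eqs = make_system(n=8, T=4000, delta=0.3, rng=rng)
    x_hat = solve_from_pairs(8, majority_vote(eqs))
    print(x_hat, x_true)
```

The output can only be compared with the true vector up to complementation, since $X^*$ and $\bar X^*$ satisfy the same system.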

Denote by $P(n,T)$ the probability of reconstructing the true solution of system
(3.6.1) by the voting algorithm, that is,

\[ P(n,T) = P\{\hat a_{ij} = a_{ij}^*, \ i, j = 1, \ldots, n\}. \]

Theorem 3.6.1. If $n, T \to \infty$ and $\Delta \to 0$ such that

\[ \frac{\Delta^2 T}{n^2 \log n} \to \infty, \]

then $P(n,T) \to 1$.


Proof. Let

\[ \mu(n,T) = \min_{1 \le i < j \le n} m_{ij}, \]

where the minimum is over all subsystems of the form (3.6.4). It is clear that

\[ P(n,T) = P\{\mu(n,T) > m\} P_m(n,T) + P\{\mu(n,T) \le m\} \bar P_m(n,T), \tag{3.6.5} \]

where $P_m(n,T)$ and $\bar P_m(n,T)$ are the conditional probabilities of reconstructing
the true solution under the conditions $\{\mu(n,T) > m\}$ and $\{\mu(n,T) \le m\}$,
respectively.

We obtain a rough estimate for the probability $P\{\mu(n,T) > m\}$. It is clear that
$P\{\mu(n,T) > m\}$ is the probability that each cell contains more than $m$ particles
in the classical scheme of allocating $T$ particles into $\binom{n}{2}$ cells. Denote by $\eta_i$ the
number of particles in the $i$th cell and put $\xi_i = 1$ if $\eta_i \le m$, and $\xi_i = 0$ if $\eta_i > m$,
$i = 1, \ldots, \binom{n}{2}$. Then

\[ P\{\mu(n,T) \le m\} = P\{\xi_1 + \cdots + \xi_{\binom{n}{2}} > 0\} \le \binom{n}{2} E\xi_1 = \binom{n}{2} P\{\eta_1 \le m\}. \]


The random variable 7; has the binomial distribution with T trials and the 
probability of success (a ae Since a = En; = T/(5) > 00, the normal ap- 
proximation is valid for this distribution. We choose m = a(1 — A), assume that 
(A./a)?/T — 0, and estimate the probability P{n, < m)}. By taking into account 


the choice of m and the equality Dn, = a(1 + o(1)), we obtain 


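\[
\begin{aligned}
P\{\eta_1 \le m\} &= P\{(\eta_1 - a)/\sqrt{D\eta_1} \le (m - a)/\sqrt{D\eta_1}\} \\
 &= P\{(\eta_1 - a)/\sqrt{D\eta_1} \le -\Delta\sqrt{a}\,(1 + o(1))\} \\
 &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-\Delta\sqrt{a}} e^{-u^2/2} \, du \, (1 + o(1)).
\end{aligned}
\]

Hence, there exists a constant $c$ such that

\[ P\{\eta_1 \le m\} \le c \, e^{-\Delta^2 a/2}. \]

Thus, for $m = a(1 - \Delta)$,

\[ P\{\mu(n,T) \le m\} \to 0 \tag{3.6.6} \]

because $\Delta^2 a/\log n \to \infty$ and $n^2 e^{-\Delta^2 a/2} \to 0$.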


Now we have to show that under the conditions of the theorem, $P_m(n,T) \to 1$.
In other words, we have to prove that $\hat a_{ij} = a_{ij}^*$ for all $i, j = 1, \ldots, n$ with
probability tending to 1. The additional requirement of the indecomposability of
the system (3.6.1) or of the connectedness of the graph $\Gamma_{n,T}$ is obviously fulfilled.

Recall that $b_t = b_t^* + \varepsilon_t$. We may assume that in the subsystem (3.6.4),

\[ a_{ij}^{(k)} = a_{ij}^* + \varepsilon_{ij}^{(k)}, \qquad k = 1, \ldots, m_{ij}, \]

where the random variables $\varepsilon_{ij}^{(1)}, \ldots, \varepsilon_{ij}^{(m_{ij})}$ are independent and have the same
distribution as $\varepsilon_1, \ldots, \varepsilon_T$ from the right-hand side of (3.6.1). Denote by $\xi(n,T)$
the number of wrong decisions, that is, the number of realized events $\{\hat a_{ij} \ne a_{ij}^*\}$,
$i, j = 1, \ldots, n$. Now let $\xi_{ij} = 1$ if $\varepsilon_{ij}^{(1)} + \cdots + \varepsilon_{ij}^{(m_{ij})} > m_{ij}/2$, and $\xi_{ij} = 0$
otherwise. It is clear that the number of wrong decisions can be represented in the
form

\[ \xi(n,T) = \sum_{i<j} \xi_{ij}, \]

and

\[
\begin{aligned}
1 - P_m(n,T) &= P\{\xi(n,T) > 0 \mid \mu(n,T) > m\} \\
 &\le \binom{n}{2} E(\xi_{12} \mid \mu(n,T) > m) \\
 &= \binom{n}{2} P\{\xi_{12} = 1 \mid \mu(n,T) > m\}.
\end{aligned}
\tag{3.6.7}
\]
Now we derive estimates for

\[ P\{\xi_{12} = 1 \mid \mu(n,T) > m\} = P\{\varepsilon_{12}^{(1)} + \cdots + \varepsilon_{12}^{(m_{12})} > m_{12}/2 \mid \mu(n,T) > m\}. \]

The random variables $\varepsilon_{12}^{(1)}, \ldots, \varepsilon_{12}^{(m_{12})}$ are independent and have the same distribution
as the random variables $\varepsilon_1, \ldots, \varepsilon_T$ from the right-hand side of system (3.6.1).
We set $S_k = \varepsilon_1 + \cdots + \varepsilon_k$ and estimate

\[ P\{S_k > k/2\} = P\{S_k - ES_k > k\Delta/2\}. \]

Here, and later in this section, we use the following inequality of exponential type
for the sum $S_k$ that was proposed by Hoeffding [59] and can be found in [122] (see
Theorem 1.1.16). For any positive $\Delta$,

\[ P\{S_k - ES_k \ge k\Delta/2\} \le e^{-k\Delta^2/2}. \tag{3.6.8} \]
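
Inequality (3.6.8) can be compared with the exact binomial tail for small $k$. The sketch assumes that $S_k$ is binomial with parameters $(k, (1 - \Delta)/2)$, so that $ES_k = k(1 - \Delta)/2$ and the event $\{S_k > k/2\}$ is the event that the majority of $k$ votes is wrong; the function names are illustrative.

```python
from math import comb, exp

def tail_wrong_majority(k, delta):
    """Exact P{S_k > k/2} for S_k ~ Binomial(k, (1 - delta)/2)."""
    p = (1 - delta) / 2
    return sum(comb(k, j) * p ** j * (1 - p) ** (k - j)
               for j in range(k // 2 + 1, k + 1))

def hoeffding_bound(k, delta):
    """Right-hand side of (3.6.8): exp(-k * delta**2 / 2)."""
    return exp(-k * delta ** 2 / 2)

if __name__ == "__main__":
    for k in (5, 20, 101):
        for delta in (0.1, 0.5, 0.9):
            print(k, delta, tail_wrong_majority(k, delta), hoeffding_bound(k, delta))
```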
Therefore

\[ P\{\varepsilon_{12}^{(1)} + \cdots + \varepsilon_{12}^{(m_{12})} > m_{12}/2 \mid \mu(n,T) > m\} \le e^{-m\Delta^2/2}, \]

and from (3.6.7), we obtain

\[ 1 - P_m(n,T) \le \binom{n}{2} e^{-m\Delta^2/2}. \tag{3.6.9} \]

For $m = a(1 - \Delta)$, $a = T/\binom{n}{2}$, under the conditions of Theorem 3.6.1, the right-hand
side of (3.6.9) tends to zero. Thus, the assertion of the theorem now follows
from (3.6.5), (3.6.6), and (3.6.9). □


We now describe the second algorithm for reconstructing the true solution of
system (3.6.1), which can be called the method of coordinate testing.

We choose a vector $X^{(0)} = (x_1^{(0)}, \ldots, x_n^{(0)})$ by random sampling from the
set of all $n$-dimensional vectors over GF(2). Denote by $B^{(0)} = (b_1^{(0)}, \ldots, b_T^{(0)})$
the column-vector obtained by substituting $X^{(0)}$ for $X$ in the left-hand side of
(3.6.1). Let $\beta(X^{(0)})$ be the number of the coordinates of $B^{(0)}$ that coincide with
the corresponding coordinates of the vector $B = (b_1, \ldots, b_T)$ of the right-hand
sides of system (3.6.1). We construct a vector $X^{(1)} = (x_1^{(1)}, \ldots, x_n^{(1)})$ from $X^{(0)}$
and system (3.6.1) and show that, with probability tending to 1, the vector $X^{(1)}$
coincides with the true solution $X^*$.

To this end, we consider the vectors

\[ X_{i,0} = (x_1^{(0)}, \ldots, x_{i-1}^{(0)}, 0, x_{i+1}^{(0)}, \ldots, x_n^{(0)}), \]
\[ X_{i,1} = (x_1^{(0)}, \ldots, x_{i-1}^{(0)}, 1, x_{i+1}^{(0)}, \ldots, x_n^{(0)}), \]

and calculate the values $\beta(X_{i,0})$ and $\beta(X_{i,1})$, defined for the vectors $X_{i,0}$ and $X_{i,1}$
in the same way $\beta(X^{(0)})$ was defined for $X^{(0)}$.
For $i = 1, \ldots, n$, let

\[ x_i^{(1)} = \begin{cases} 0 & \text{if } \beta(X_{i,0}) \ge \beta(X_{i,1}), \\ 1 & \text{if } \beta(X_{i,0}) < \beta(X_{i,1}). \end{cases} \]

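Denote by $\zeta(X^{(0)})$ the number of coordinates of the vectors $X^{(0)}$ and $X^*$ that
coincide. The value

\[ \eta(X^{(0)}) = \max(\zeta(X^{(0)}), \zeta(\bar X^{(0)})), \]

where $\bar X^{(0)} = (\bar x_1^{(0)}, \ldots, \bar x_n^{(0)}) = (x_1^{(0)} + 1, \ldots, x_n^{(0)} + 1)$, is called the number of
coincidences.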
Lemma 3.6.1. If $n \to \infty$, then the distribution of the random variable
$(2\eta(X^{(0)}) - n)/\sqrt{n}$ converges weakly to the distribution of the modulus of the
random variable that has the normal distribution with parameters (0, 1).

Proof. Since the vector $X^{(0)}$ is chosen from the set of all $n$-dimensional vectors
by random sampling with equal probabilities, the random variable $S_n = \zeta(X^{(0)})$
has the binomial distribution with parameters $(n, 1/2)$. From the obvious equality
$\zeta(X^{(0)}) + \zeta(\bar X^{(0)}) = n$, the random variable $\eta(X^{(0)})$ is represented in the form

\[ \eta(X^{(0)}) = \max(S_n, n - S_n). \]

It is clear that

\[ \frac{1}{\sqrt{n}} (2\eta(X^{(0)}) - n) = \max\left( \frac{2S_n - n}{\sqrt{n}}, \frac{n - 2S_n}{\sqrt{n}} \right) = \frac{1}{\sqrt{n}} |2S_n - n|, \]

and the assertion of Lemma 3.6.1 follows from the convergence of the distribution
of $(2S_n - n)/\sqrt{n}$ to the normal distribution with parameters (0, 1). □


We can now prove the following assertion concerning the algorithm of coordinate
testing.

Theorem 3.6.2. If $n, T \to \infty$ and $\Delta \to 0$ such that

\[ \frac{\Delta^2 T}{n^2 \log n} \to \infty, \]

then

\[ P\{X^{(1)} = X^*\} \to 1. \]


Proof. For definiteness, assume

\[ \zeta(X^{(0)}) \ge \zeta(\bar X^{(0)}). \]

The coordinates of $X^{(0)}$ that coincide with the corresponding coordinates of
$X^*$ are called true, whereas those that do not coincide are called wrong.

For the algorithm of coordinate testing to lead to the true solution, the following
obvious conditions must be fulfilled. For each coordinate of the vector $X^{(0)}$, the
value of $\beta(X^{(0)})$ must increase if we replace the wrong value of the coordinate by
the true value, and the value of $\beta(X^{(0)})$ must strictly decrease if we replace the
true value by the wrong one.

We separate all the equations of the system (3.6.1) that contain $x_i$, and denote the
number of such equations by $n_i$. Replacing $x_i^{(0)}$ by $\bar x_i^{(0)}$ changes the contribution
in $\beta(X^{(0)})$ of these equations only, and each equation containing $x_i$ contributes
1 or $-1$. If $x_i^{(0)}$ is wrong, then the increment of $\beta(X^{(0)})$ due to replacing $x_i^{(0)}$ by
$\bar x_i^{(0)}$ is equal to the random variable $\beta_i(X^{(0)})$ such that $(\beta_i(X^{(0)}) + n_i)/2$ has the
binomial distribution with parameters $(n_i, p_i)$, where $p_i$ is the probability that the
coincidence in a fixed equation containing $x_i$ appears after substituting $\bar x_i^{(0)}$ for
$x_i^{(0)}$, provided $x_i^{(0)}$ is wrong. It is not difficult to see that

\[ p_i = \nu q + (1 - \nu) p, \tag{3.6.10} \]

where $q = P\{b_t = b_t^*\}$, $p = 1 - q$, and $\nu$ is the probability that the second variable
in the equation has the true value. The second variable takes values from the set

\[ \{x_1^{(0)}, \ldots, x_n^{(0)}\} \setminus \{x_i^{(0)}\} \]

with equal probabilities. Therefore $\nu = k/(n-1)$, where $k$ is the number
of true coordinates of $X^{(0)}$, all of which lie among the remaining $n - 1$ coordinates
because $x_i^{(0)}$ is wrong, and which equals $\zeta(X^{(0)})$ under the assumption that
$\zeta(X^{(0)}) \ge \zeta(\bar X^{(0)})$. It follows from Lemma 3.6.1 and equality (3.6.10) that

\[ p_i = \frac{(1 + \Delta)k}{2(n-1)} + \frac{(1 - \Delta)(n - 1 - k)}{2(n-1)} = \frac{1}{2} + \frac{(2\zeta(X^{(0)}) - n + 1)\Delta}{2(n-1)}, \]

which we write as

\[ p_i = \frac{1}{2} + \frac{\Delta \xi_n}{2\sqrt{n}}, \tag{3.6.11} \]

where

\[ \xi_n = \frac{(2\zeta(X^{(0)}) - n + 1)\sqrt{n}}{n - 1}. \]

By assumption, $\xi_n$ is asymptotically normal with parameters (0, 1).
Therefore

\[ P\{|\xi_n| > (\Delta^2 T/(n^2 \log n))^{-1/4}\} \to 1 \tag{3.6.12} \]

because $\Delta^2 T/(n^2 \log n) \to \infty$.

Next, we find a lower bound for $n_i$, $i = 1, \ldots, n$. To this end, we take into
account only the first variable in each equation. Then we obtain the classical
scheme of equiprobable allocation of $T$ particles into $n$ cells, and by applying the
corresponding results on the distribution of the minimum of contents of cells [90],
we find that

\[ P\Bigl\{ \min_{1 \le i \le n} n_i \ge T/(2n) \Bigr\} \to 1. \tag{3.6.13} \]
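For the increments $\beta_i(X^{(0)})$, we have

\[
\begin{aligned}
P\{\beta_i(X^{(0)}) \le 0 \mid x_i^{(0)} \text{ is wrong}\}
 &= P\{(\beta_i(X^{(0)}) + n_i)/2 \le n_i/2 \mid x_i^{(0)} \text{ is wrong}\} \\
 &= P\{S_{n_i} \le n_i/2\},
\end{aligned}
\]

where $S_{n_i}$ has the binomial distribution with parameters $(n_i, p_i)$. From (3.6.11),
we find that

\[ P\{S_{n_i} \le n_i/2\} = P\{S_{n_i} - ES_{n_i} \le -\Delta|\xi_n| n_i/(2\sqrt{n})\}. \]

When we use estimate (3.6.8) of the exponential type for the binomial distribution
and take into account (3.6.12) and (3.6.13), we obtain

\[ P\{S_{n_i} \le n_i/2\} \le \exp\left\{ -\frac{\Delta^2 T}{4n^2} \left( \frac{\Delta^2 T}{n^2 \log n} \right)^{-1/2} \right\}. \]

In a similar way, we obtain the bound

\[ P\{\beta_i(X^{(0)}) \ge 0 \mid x_i^{(0)} \text{ is true}\} \le \exp\left\{ -\frac{\Delta^2 T}{4n^2} \left( \frac{\Delta^2 T}{n^2 \log n} \right)^{-1/2} \right\}. \]

Therefore an upper estimate for the probability of at least one wrong decision
while testing all the coordinates of the vector $X^{(0)}$ is

\[
\sum_{i=1}^{n} \bigl( P\{\beta_i(X^{(0)}) \le 0 \mid x_i^{(0)} \text{ is wrong}\} + P\{\beta_i(X^{(0)}) \ge 0 \mid x_i^{(0)} \text{ is true}\} \bigr)
 \le 2n \exp\left\{ -\frac{\Delta^2 T}{4n^2} \left( \frac{\Delta^2 T}{n^2 \log n} \right)^{-1/2} \right\}
\]

and tends to zero under the conditions of the theorem because
$\Delta^2 T/(n^2 \log n) \to \infty$. □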

With the help of a preliminary search of the $n$-dimensional vectors, it is possible
to select an initial vector $X^{(0)}$ with a great number $\eta(X^{(0)})$ of coordinates coinciding
with the corresponding coordinates of the true solution $X^*$. If the algorithm
for coordinate testing begins with this initial vector, then a much smaller number
of equations is needed to reconstruct the true solution. This number is comparable
to the number of edges needed for the graph $\Gamma_{n,T}$ to be connected.


Theorem 3.6.3. If $n, T \to \infty$ and $\Delta \to 0$ such that

\[ \frac{\Delta^2 T}{n \log n} \to \infty, \]

then there exists an algorithm that reconstructs the true solution of system (3.6.1)
with probability tending to 1.


Proof. The algorithm, which gives the true solution under the conditions of the
theorem, begins with a preliminary search of an initial vector $X^{(0)}$ with a large
number of coincidences with the true vector $X^*$. The choice of $X^{(0)}$ is determined
by a search of all $n$-dimensional vectors. To this end, we choose the level
$l = Tq - u_T\sqrt{T}$, where $q = P\{b_t = b_t^*\} = (1 + \Delta)/2$ and $u_T = \Delta\sqrt{T}/18$, and select
the vectors $X$ for which $\beta(X) \ge l$. Recall that $\beta(X)$ is the number of coincident
coordinates of the vector $B = (b_1, \ldots, b_T)$ and the vector of the right-hand sides
of system (3.6.1) that are obtained when $X$ is substituted into the left-hand side of
the system.
The vector $X^*$ will be selected with probability tending to 1. Indeed,

\[ P\{\beta(X^*) < Tq - u_T\sqrt{T}\} = P\{S_T - ES_T < -u_T\sqrt{T}\}, \]

where $S_T$ is the number of successes in $T$ independent trials with the probability
of success equal to $q = (1 + \Delta)/2$. By using estimate (3.6.8), we find that

\[ P\{\beta(X^*) < l\} \le e^{-2u_T^2}, \]

and the complementary probability $P\{\beta(X^*) \ge l\} \to 1$ because $u_T \to \infty$.
If $\zeta(X) = s$, then the probability of the coincidence of a fixed component of
the right-hand sides is

\[ p(s) = \frac{q s(s-1)}{n(n-1)} + \frac{q(n-s)(n-s-1)}{n(n-1)} + \frac{2(1-q)s(n-s)}{n(n-1)}, \]

and, since $q = (1 + \Delta)/2$, we find

\[ p(s) = \frac{1}{2} + \frac{\Delta\bigl((2s - n)^2 - n\bigr)}{2n(n-1)}. \]


For example, let $s \le 2n/3$. Then $p(s) \le 1/2 + \Delta/9$, beginning with some $n$, and
for any fixed $X$ with $\zeta(X) = s \le 2n/3$,

\[
\begin{aligned}
P\{\beta(X) \ge l\} &= P\{S_T \ge Tq - u_T\sqrt{T}\} \\
 &\le P\{S_T - ES_T \ge 7\Delta T/18 - u_T\sqrt{T}\} \\
 &= P\{S_T - ES_T \ge \Delta T/3\},
\end{aligned}
\]

where $S_T$ is the number of successes in $T$ independent trials with probability $p(s)$
of success.
By using the inequality (3.6.8) of exponential type, we find

\[ P\{\beta(X) \ge l\} \le e^{-\Delta^2 T/18}. \]
The probability that at least one of the vectors $X$ with $\zeta(X) \le 2n/3$ will be selected
does not exceed $2^n e^{-\Delta^2 T/18}$, and under the conditions of the theorem this
probability tends to zero. Thus, with the help of the exhaustive search, it is possible
to select, with probability tending to 1, a vector $X^{(0)}$ such that $\zeta(X^{(0)}) > 2n/3$.
Beginning the algorithm for coordinate testing with this vector $X^{(0)}$, we find, using
the notation introduced in the proof of Theorem 3.6.2, that

\[ P\{\beta_i(X^{(0)}) \le 0 \mid x_i^{(0)} \text{ is wrong}\} = P\{S_{n_i} - ES_{n_i} \le -\Delta|\xi_n| n_i/(2\sqrt{n})\}. \]

Using estimate (3.6.8) and taking into account that with probability tending to 1,
$|\xi_n| \ge \sqrt{n}/3$ for the selected vector and $n_i \ge T/(2n)$, we find the estimate

\[ P\{\beta_i(X^{(0)}) \le 0 \mid x_i^{(0)} \text{ is wrong}\} \le P\{S_{n_i} - ES_{n_i} \le -\Delta n_i/6\} \le e^{-\Delta^2 T/(36n)}. \]

Similarly we obtain

\[ P\{\beta_i(X^{(0)}) \ge 0 \mid x_i^{(0)} \text{ is true}\} \le e^{-\Delta^2 T/(36n)}. \]

As in the proof of Theorem 3.6.2, an upper bound for the probability of at least
one wrong decision, while all $n$ coordinates of $X^{(0)}$ are tested, is $2n e^{-\Delta^2 T/(36n)}$
and tends to zero under the conditions of the theorem. □


Thus, if we use the exhaustive search, then the true solution can be reconstructed
under the condition $\Delta^2 T/(n\log n) \to \infty$. If the number of equations $T$ is such
that $\Delta^2 T/(n^2 \log n) \to \infty$, then the reconstruction can be realized by the voting
algorithm, which is more economical with respect to the number of operations.
Clearly, there is considerable interest in the algorithms that lead to the true solution
with probability tending to 1 under intermediate conditions on the number of
equations $T$ and do not require the exhaustive search of all $2^n$ vectors.

Let us describe an algorithm that will be referred to as $A_2$. Consider all $\binom{T}{2}$
equations obtained as the pairwise sums of the equations of the system (3.6.1).
Among the equations obtained by this operation, there are equations that contain
either four, or two, or zero unknowns each. Denote by $S_2$ the subsystem that
includes all the equations with two unknowns each. The algorithm $A_2$ ends with
the application of the voting algorithm to the subsystem $S_2$. The following theorem
gives the conditions under which the algorithm $A_2$ reconstructs the true solution.
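
The construction of the subsystem $S_2$ can be sketched as follows. Equations are assumed to be triples `(i, j, b)` with $i < j$, the sum of two equations is taken over GF(2) (so a shared unknown cancels), and the majority rule with ties resolved to 0 mirrors the voting algorithm; the function names are illustrative.

```python
from collections import defaultdict
from itertools import combinations

def two_unknown_sums(eqs):
    """Subsystem S_2: GF(2) sums of pairs of equations that leave exactly two unknowns."""
    s2 = []
    for (i1, j1, b1), (i2, j2, b2) in combinations(eqs, 2):
        left = {i1, j1} ^ {i2, j2}      # unknowns that survive the sum
        if len(left) == 2:
            i, j = sorted(left)
            s2.append((i, j, b1 ^ b2))
    return s2

def vote(eqs):
    """Majority rule per pair (i, j): 1 iff more than half of the right-hand sides are 1."""
    votes = defaultdict(list)
    for i, j, b in eqs:
        votes[(i, j)].append(b)
    return {pair: int(2 * sum(bs) > len(bs)) for pair, bs in votes.items()}

def algorithm_a2(eqs):
    """Corrected right-hand sides obtained by voting inside S_2."""
    return vote(two_unknown_sums(eqs))

if __name__ == "__main__":
    print(algorithm_a2([(0, 1, 1), (1, 2, 0), (0, 1, 1), (2, 3, 1)]))
```

Forming all pairs costs on the order of $T^2$ operations, which is the price paid for using fewer equations than the plain voting algorithm requires.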


Theorem 3.6.4. If $n, T \to \infty$ and $\Delta \to 0$ such that

\[ \frac{\Delta^4 T^2}{n^3 \log n} \to \infty, \]

then the algorithm $A_2$ reconstructs the true solution with probability tending to 1.
Proof. Let $i$ and $j$ be arbitrary, assume $i < j$, and consider all equations of the
system $S_2$ of the form

\[
\begin{aligned}
x_i + x_j &= b_{ij}^{(1)}, \\
&\;\;\vdots \\
x_i + x_j &= b_{ij}^{(m_{ij})}.
\end{aligned}
\tag{3.6.14}
\]

The equality $m_{ij} = m$ means that the graph $\Gamma_{n,T}$ corresponding to system
(3.6.1) contains exactly $m$ vertices, say $v_1, \ldots, v_m$, such that the graph $\Gamma_{n,T}$
contains the edges $(v_1, i), (v_1, j), \ldots, (v_m, i), (v_m, j)$. The right-hand sides
$b_{ij}^{(1)}, \ldots, b_{ij}^{(m_{ij})}$ are the pairwise sums of $2m_{ij}$ independent random variables, and
therefore they are independent and, according to Lemma 3.2.1, take the true value
$b_{ij}^* = x_i^* + x_j^*$ with probability $(1 + \Delta^2)/2$ and the wrong value with probability
$(1 - \Delta^2)/2$.
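
The two probabilities can be checked directly. Assuming that the errors $\varepsilon_1$, $\varepsilon_2$ of the two summed equations are independent with $P\{\varepsilon = 0\} = (1 + \Delta)/2$ and $P\{\varepsilon = 1\} = (1 - \Delta)/2$, the sum of the right-hand sides is correct exactly when $\varepsilon_1 + \varepsilon_2 = 0$ in GF(2):

```latex
\begin{align*}
P\{\varepsilon_1 + \varepsilon_2 = 0\}
  &= \Bigl(\frac{1+\Delta}{2}\Bigr)^{2} + \Bigl(\frac{1-\Delta}{2}\Bigr)^{2}
   = \frac{1+\Delta^{2}}{2}, \\
P\{\varepsilon_1 + \varepsilon_2 = 1\}
  &= 2\cdot\frac{1+\Delta}{2}\cdot\frac{1-\Delta}{2}
   = \frac{1-\Delta^{2}}{2}.
\end{align*}
```

In other words, summing two equations replaces the excess $\Delta$ by $\Delta^2$, which is why the condition of Theorem 3.6.4 involves a higher power of $\Delta$ than the conditions of the preceding theorems.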

Let $\hat b_{ij} = 1$ if $b_{ij}^{(1)} + \cdots + b_{ij}^{(m_{ij})} > m_{ij}/2$, and $\hat b_{ij} = 0$ otherwise. As in
the proof of Theorem 3.6.1, we denote by $\mu(n,T)$ the minimum value of $m_{ij}$
over all subsystems of the form (3.6.14). As in (3.6.5), the probability $P(n,T)$ of
reconstructing the true solution can be represented in the form

\[ P(n,T) = P\{\mu(n,T) > m\} P_m(n,T) + P\{\mu(n,T) \le m\} \bar P_m(n,T), \tag{3.6.15} \]

where $P_m(n,T)$ and $\bar P_m(n,T)$ are the conditional probabilities of reconstructing
the true solution by the majority method under the condition that $\{\mu(n,T) > m\}$
and $\{\mu(n,T) \le m\}$, respectively.

As in the proof of Theorem 3.6.1, we need to estimate $P\{\mu(n,T) > m\}$, but
here this estimation is more laborious.

Let $\xi_{ij} = 1$ if $m_{ij} \le m$, and $\xi_{ij} = 0$ if $m_{ij} > m$, $i < j$, $i, j = 1, \ldots, n$. It is
clear that

\[ P\{\mu(n,T) \le m\} = P\Bigl\{ \sum_{i<j} \xi_{ij} > 0 \Bigr\} \le \binom{n}{2} P\{m_{12} \le m\}. \tag{3.6.16} \]

Let $\mu_i = 1$ if the edges $(i+2, 1)$ and $(i+2, 2)$ occur in $\Gamma_{n,T}$, and $\mu_i = 0$
otherwise; and $\nu_i = 1$ if exactly one of the edges $(i+2, 1)$, $(i+2, 2)$ occurs in
$\Gamma_{n,T}$, and $\nu_i = 0$ if the edges $(i+2, 1)$ and $(i+2, 2)$ do not occur in $\Gamma_{n,T}$. The
random variable $m_{12}$ can be represented as the following sum of indicators:

\[ m_{12} = \mu_1 + \cdots + \mu_{n-2}, \]
and

\[ P\{m_{12} \le m\} = \sum_{k=0}^{m} \sum_{\{i_1, \ldots, i_k\}} p_{i_1 \ldots i_k}, \tag{3.6.17} \]

where $p_{i_1 \ldots i_k}$ is the probability that $\mu_{i_1}, \ldots, \mu_{i_k}$ take the value 1 and all the other
random variables take the value 0.
It is not difficult to see that $(M_t, N_t)$, where

\[ M_t = \mu_1 + \cdots + \mu_t, \qquad N_t = \nu_1 + \cdots + \nu_t, \]

is a Markov chain because $(\mu_{t+1}, \nu_{t+1})$ depends only on the number of edges used
to construct the random variables $\mu_1, \ldots, \mu_t$, $\nu_1, \ldots, \nu_t$, $t = 1, \ldots, n - 2$. More
precisely, let

\[ p(t \mid Y_{t-1}, Z_{t-1}) = P\{\mu_t = 1 \mid M_{t-1} = Y_{t-1}, N_{t-1} = Z_{t-1}\}, \]

\[ q(t \mid Y_{t-1}, Z_{t-1}) = P\{\mu_t = 0 \mid M_{t-1} = Y_{t-1}, N_{t-1} = Z_{t-1}\}. \]

By using this notation, we can write the probability $p_{i_1 \ldots i_k}$ that $\mu_{i_1}, \ldots, \mu_{i_k}$ take
the value 1 and all the other random variables take the value 0 in the form

\[
\begin{aligned}
p_{i_1 \ldots i_k} &= P\{\mu_{i_1} = 1, \ldots, \mu_{i_k} = 1, \ \mu_i = 0, \ i \ne i_1, \ldots, i_k \mid \nu_1 = z_1, \ldots, \nu_{n-2} = z_{n-2}\} \\
 &= q(1 \mid Y_0, Z_0) \cdots q(i_1 - 1 \mid Y_{i_1 - 2}, Z_{i_1 - 2}) \, p(i_1 \mid Y_{i_1 - 1}, Z_{i_1 - 1}) \\
 &\quad \times q(i_1 + 1 \mid Y_{i_1}, Z_{i_1}) \cdots q(n - 2 \mid Y_{n-3}, Z_{n-3}),
\end{aligned}
\]

where $Z_0 = Y_0 = 0$, $Z_t = z_1 + \cdots + z_t$, and $Y_t$ is the number of $i_1, \ldots, i_k$ that
do not exceed $t$.

We now estimate the probabilities $p(t \mid Y, Z)$ and $q(t \mid Y, Z)$. It is clear that
$p(t \mid Y, Z) + q(t \mid Y, Z) = 1$, and the probability $p(t \mid Y, Z)$ does not depend on
$t$ and equals the probability $p_2(s, N)$ that two fixed places corresponding to the
edges $(1, t)$, $(2, t)$ will be occupied after allocating $s = T - 2Y - Z$ edges into
$N = \binom{n}{2} - Z$ places in the classical scheme of allocation of particles. Therefore

$$p_2(s, N) = \sum_{k \ge 1, \ l \ge 1, \ k + l \le s} \frac{s!}{k! \, l! \, (s - k - l)!} \, \frac{1}{N^{k+l}} \left(1 - \frac{2}{N}\right)^{s-k-l},$$




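and we have the following estimates:

$$p_2(s, N) \ge \frac{s(s - 1)}{N^2} \left(1 - \frac{2}{N}\right)^{s-2},$$

$$p_2(s, N) \le \frac{s(s - 1)}{N^2} \sum_{k \ge 0, \ l \ge 0, \ k + l \le s - 2} \frac{(s - 2)!}{k! \, l! \, (s - k - l - 2)!} \, \frac{1}{N^{k+l}} \left(1 - \frac{2}{N}\right)^{s-k-l-2} = \frac{s(s - 1)}{N^2}.$$

Since

$$T - 3n \le s = T - 2Y - Z \le T, \qquad n(n - 3)/2 \le N = \binom{n}{2} - Z \le n(n - 1)/2,$$

we obtain for all $k = 0, 1, \ldots, n - 2$,

$$p_{i_1 \ldots i_k} \le P^k Q^{n-2-k},$$

where

$$P = \max p(t \mid Y_{t-1}, Z_{t-1}),$$

$$Q = 1 - \min p(t \mid Y_{t-1}, Z_{t-1}).$$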
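
The estimates for $p_2(s, N)$ are easy to check numerically. The following Python sketch is only an illustration and is not part of the original argument. It assumes the classical scheme in which $s$ particles are thrown independently and uniformly into $N$ cells, computes the probability that two fixed cells are both occupied by inclusion-exclusion, and compares it with the elementary bounds $s(s-1)N^{-2}(1 - 2/N)^{s-2}$ and $s(s-1)N^{-2}$. The function names are hypothetical.

```python
from fractions import Fraction


def p2_exact(s: int, n_cells: int) -> Fraction:
    """Probability that two fixed cells are both occupied after s particles
    are thrown independently and uniformly into n_cells cells.

    Inclusion-exclusion: 1 - P(cell 1 empty) - P(cell 2 empty) + P(both empty).
    """
    one_empty = (1 - Fraction(1, n_cells)) ** s
    both_empty = (1 - Fraction(2, n_cells)) ** s
    return 1 - 2 * one_empty + both_empty


def p2_bounds(s: int, n_cells: int) -> tuple[Fraction, Fraction]:
    """Bounds s(s-1)/N^2 * (1 - 2/N)^(s-2) <= p2 <= s(s-1)/N^2 (assumes s >= 2)."""
    upper = Fraction(s * (s - 1), n_cells ** 2)
    lower = upper * (1 - Fraction(2, n_cells)) ** (s - 2)
    return lower, upper


if __name__ == "__main__":
    for s, n_cells in [(2, 10), (5, 100), (50, 5000)]:
        lower, upper = p2_bounds(s, n_cells)
        exact = p2_exact(s, n_cells)
        print(s, n_cells, float(lower), float(exact), float(upper))
```

Exact rational arithmetic is used so that the comparison is not affected by rounding when $s/N$ is small.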

Therefore it follows from (3.6.16) and (3.6.17) that

$$P\{m_{12} \le m\} \le (P + Q)^{n-2} P\{\xi_1 + \cdots + \xi_{n-2} \le m\},$$

where $\xi_1, \ldots, \xi_{n-2}$ are independent identically distributed random variables,

$$P\{\xi_1 = 1\} = P/(P + Q), \qquad P\{\xi_1 = 0\} = Q/(P + Q).$$

As $n, T \to \infty$ and $T/\binom{n}{2} \to 0$,

$$\frac{P}{P + Q} = \frac{4T^2}{n^4} \left(1 + O(T/n^2)\right),$$

and under the conditions of the theorem,

$$(P + Q)^{n-2} = 1 + o(1). \tag{3.6.18}$$

The random variable $\zeta_{n-2} = \xi_1 + \cdots + \xi_{n-2}$ has the binomial distribution with
parameters $(n - 2, P/(P + Q))$.

Let $\mu = E\zeta_{n-2} = (n - 2)P/(P + Q)$ and $m = \mu(1 - \Delta^2)$. We assume that $T$
is not too large, so that $\Delta^4 T^2/n^{10/3} \to 0$. Then, for sufficiently large $n$,

$$P\{\zeta_{n-2} \le m\} \le P\left\{\frac{\zeta_{n-2} - E\zeta_{n-2}}{\sqrt{D\zeta_{n-2}}} \le -\frac{\Delta^2 \sqrt{\mu}}{2} (1 + o(1))\right\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-\Delta^2 \sqrt{\mu}/2} e^{-u^2/2} \, du \, (1 + o(1)),$$

and there exists a constant $c$ such that

$$P\{\zeta_{n-2} \le m\} \le c e^{-\Delta^4 \mu/8}. \tag{3.6.19}$$

Thus, by virtue of (3.6.16), (3.6.18), and (3.6.19),

$$P\{\mu(n, T) \le m\} \le c n^2 e^{-\Delta^4 \mu/8} \to 0 \tag{3.6.20}$$

because, under the conditions of the theorem, $\Delta^4 T^2/(n^3 \log n) \to \infty$, and
consequently, $n^2 e^{-\Delta^4 \mu/8} \to 0$.

As in the proof of Theorem 3.6.1, we have to show that under the conditions
of Theorem 3.6.4, the system S is indecomposable and $P(n, T) \to 1$. In other
words, we have to show that $b_{ij} = b^*_{ij}$ for all $i, j = 1, \ldots, n$ with probability
tending to 1.

By the same reasoning as in the proof of Theorem 3.6.1, for $m = \mu(1 - \Delta^2)$,
we obtain


1 — Pn(n, T) < ja (3.6.21) 


and under the conditions of Theorem 3.6.4, the right-hand side of (3.6.21) tends 
to zero. 
The assertion of the theorem follows from (3.6.15), (3.6.20), and (3.6.21). □


3.7. Notes and references 


The theory of systems of random equations in finite fields was developed by the 
Russian mathematicians V. E. Stepanov, G. V. Balakin, I. N. Kovalenko, A. A. Lev- 
itskaya, and others. The connection between systems of equations in GF(2) and 
graphs was first pointed out and used by Stepanov. The notion of a critical set was 
introduced in [79] (see also [13] and [85]). 

The theory of recurring sequences and shift registers mentioned in Section 3.1 
can be found in [50] and [156]. 

Theorems 3.2.1 and 3.2.2 were proved by Kovalenko in [92]. This brilliant
result initiated a series of investigations of similar problems that were carried out
by Kovalenko and his school. These investigations developed in two directions.
The first direction concerns extensions of Theorems 3.2.1 and 3.2.2 to matrices
over more general algebraic structures. It is not difficult to see that by virtue of
the Markovian character of the process $\rho_n(t)$, a recurrence relation for
$p_{n,T}(k) = P\{\rho_n(T) = k\}$ can be derived and used for the proof of Theorem 3.2.1. In this way,
the extension of the result to a finite field with $q$ elements can be easily obtained
[93]. Let the elements of a $T \times n$ matrix $A = \|a_{ij}\|$ in GF(q) take the values
$0, 1, \ldots, q - 1$ with equal probabilities; then the $p_{n,T}(k)$ for any $k = 0, 1, \ldots$
satisfy the equation

$$p_{n,T}(k) = z^n p_{n,T-1}(k) + (1 - z^n) p_{n-1,T-1}(k - 1), \tag{3.7.1}$$

where $z = 1/q$. Indeed, if the first row of $A$ is a zero vector, then $\rho_n(T) =
\rho_n(T - 1)$, and if the row contains at least one nonzero element, then $\rho_n(T) =
\rho_{n-1}(T - 1) + 1$. It follows from (3.7.1) that if $s \ge 0$ and $m$ are fixed integers,
$m + s \ge 0$, $n \to \infty$, and $T = n + m$, then

$$P\{\rho_n(T) = n - s\} \to q^{-s(m+s)} \prod_{i=s+1}^{\infty} \left(1 - \frac{1}{q^i}\right) \prod_{i=1}^{m+s} \left(1 - \frac{1}{q^i}\right)^{-1}. \tag{3.7.2}$$
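
The recurrence (3.7.1) can be turned directly into a small dynamic program. The Python sketch below is an illustration under stated assumptions, not the method of [93]: the boundary conditions (a matrix with no rows or no columns has rank 0 with probability 1) are supplied here, and the function names are hypothetical. It tabulates the rank distribution of a uniformly random $T \times n$ matrix over GF(q) and evaluates the limit expression (3.7.2) for comparison.

```python
def rank_pmf(n: int, T: int, q: int) -> list[float]:
    """Rank distribution of a T x n matrix with i.i.d. uniform entries in GF(q),
    obtained by iterating recurrence (3.7.1) over the number of rows."""
    z = 1.0 / q
    kmax = min(n, T)
    # prev[j][k] = P{rank of a t x j matrix equals k}; with t = 0 rows the rank is 0.
    prev = [[1.0] + [0.0] * kmax for _ in range(n + 1)]
    for _ in range(T):
        cur = [[0.0] * (kmax + 1) for _ in range(n + 1)]
        cur[0][0] = 1.0  # assumption: a matrix with no columns has rank 0
        for j in range(1, n + 1):
            zero_row = z ** j  # probability that a row of length j is zero
            for k in range(kmax + 1):
                cur[j][k] = zero_row * prev[j][k]
                if k >= 1:
                    cur[j][k] += (1.0 - zero_row) * prev[j - 1][k - 1]
        prev = cur
    return prev[n]


def limit_probability(s: int, m: int, q: int, terms: int = 200) -> float:
    """Right-hand side of (3.7.2); the infinite product is truncated after `terms` factors."""
    infinite_part = 1.0
    for i in range(s + 1, s + 1 + terms):
        infinite_part *= 1.0 - q ** (-i)
    finite_part = 1.0
    for i in range(1, m + s + 1):
        finite_part *= 1.0 - q ** (-i)
    return q ** (-s * (m + s)) * infinite_part / finite_part


if __name__ == "__main__":
    n, m, q = 30, 2, 2
    pmf = rank_pmf(n, n + m, q)
    for s in range(4):
        print(s, pmf[n - s], limit_probability(s, m, q))
```

For moderate $n$ the two columns printed by the script should already be very close, since the finite-$n$ correction decays geometrically in $n$.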


The investigations in the second direction concern the bounds of invariance 
of the results of Theorems 3.2.1 and 3.2.2 with respect to the deviations of the 
distribution of elements of the matrix A from the equiprobable distribution. The 
problem of the invariance and a proof of Theorem 3.2.3 are given in [91, 92]. A 
modified proof of Theorem 3.2.3 is contained in [93]. 

Theorem 3.2.4 can be easily extended to any moment of a fixed order of the 
number of solutions, but that is not sufficient for the proof of the invariance property, 
since the limit distribution (3.7.2) does not satisfy the sufficient conditions of the 
unique reconstruction by its moments; hence, Theorem 1.1.3 cannot be applied. 

Levitskaya [96, 97] presents results on the number of solutions of linear random 
systems over arbitrary rings and the corresponding results on the invariance of the 
moment and the limit distributions. These results are summarized in [93], where, in 
particular, the exact bounds for the invariance are given for random linear systems 
in arbitrary finite rings. For the system considered in Theorem 3.2.3, the exact 
bounds for Oe have the form 

bn < pl < 1-4, 


where 6, = (logn + x,)/n and x, — oo arbitrarily slowly as n > oo. 

Matrices that satisfy condition (3.3.1) were considered by Balakin [12], who 
also proved Theorems 3.3.1 and 3.3.2. Closer investigation of the estimates used 
in our proof of Theorem 3.3.1 allows us to obtain the following assertions. 


Theorem 3.7.1. If $n \to \infty$,

$$T = n + \beta_n \log n,$$

$\beta_n \to -\infty$, $\beta_n = o(n/\log n)$, and condition (3.3.1) holds, then the distribution
of $s(A)$ converges to the Poisson distribution with parameter $e^{-x}$.


Theorem 3.7.2. If $n \to \infty$,

$$T = n + \beta \log n + o(\log n),$$

$\beta$ is a constant, and condition (3.3.1) holds, then the distribution of $s(A)$ converges
to the Poisson distribution with parameter $e^{-x}$ if $\beta < 0$, and with parameter
$e^{-x-\beta}$ if $\beta > 0$.


Theorems 3.3.1, 3.3.2, 3.7.1, and 3.7.2 give a complete description of the
behavior of the rank of such matrices, except for the case $\beta = 0$, where the behavior
is unknown. Note that in [12], the analogues of Theorems 3.3.1, 3.7.1, and 3.7.2
are proved for the systems over GF(q), $q > 2$ (see also [86]), and the connection
between the rank of a matrix in GF(q) and other characteristics such as the perma- 
nent rank and rank of lines is considered. The initial results on the ranks of random 
matrices are presented in [38] and [11]. 

Stepanov began investigating systems of linear equations of the form (3.4.1) 
with the help of their relations to random graphs. In particular, he proved The- 
orems 3.4.1 and 3.4.2. Now the theory of random graphs provides a basis for 
obtaining the results on the systems of random equations with coefficients taking 
their values with equal probabilities. If the coefficients of a system are essentially 
nonequiprobable, then there are no standard approaches to investigating its prop- 
erties. Only a few results are known for such systems. We remark that at this time, 
graph theory is not sufficiently developed to answer questions about nonequiprob- 
able cases. Only the method of moments (see Theorem 1.1.3) and the so-called 
direct methods are used to solve these problems. Theorem 3.4.3 is a corollary to 
Theorem 2.4.1 proved in [88] by the method of moments. 

Theorems 3.4.4 and 3.4.5 are proved in [83]. The asymptotics of the probabil- 
ity of consistency of a system of linear equations in GF(2) (and in more general 
algebraic structures) with independent random coefficients that take the values 0 
and 1 with equal probabilities have been obtained by Levitskaya [98] (see also 
[93]). This probability takes only two values and is the same for all possible 
right-hand sides of the system that are not the zero vector. It follows from Theo- 
rems 3.4.4 and 3.4.5 that the probability of consistency of the system (3.4.1) de- 
pends on the number of 1’s in the vector of the right-hand sides of the system (see 
also [83]). 

The results of Section 3.5 on the behavior of the probability of consistency of 
the system (3.5.1) can be found in [13] (see also [85]). Theorem 3.5.1 is proved by 
the author, but the critical values a, were first obtained by Balakin under slightly 
different assumptions on the matrix A,,,,7. These results are extended to GF(q) 
in [89]. The proof of Theorem 3.5.2 is given in [87].

We can consider the probability of the consistency of a system from the point
of view of mathematical statistics. Consider, for example, the system (3.4.1) and
assume the following two hypotheses on the distribution of the right-hand sides of
the system. Let the hypothesis $H_0$ be the existence of a vector $X^* = (x^*_1, \ldots, x^*_n)$,
which is interpreted as the true solution of the system, and $b_t = x^*_{i(t)} + x^*_{j(t)}$,
$t = 1, \ldots, T$. Under hypothesis $H_0$, system (3.4.1) is always consistent. Under
the alternative hypothesis $H_1$, the right-hand sides $b_1, \ldots, b_T$ are independent
random variables that are independent of the left-hand side of the system and take
the values 0 and 1 with equal probabilities. To distinguish between the hypotheses
$H_0$ and $H_1$, we can use the consistency of the system as a test: If the system is
consistent, we accept the hypothesis $H_0$, and we accept $H_1$ otherwise. Therefore
the hypothesis $H_0$ is never rejected if it is true, and the error of the first kind, the
probability of rejecting $H_0$ if it is true, is zero. The error of the second kind, the
probability of accepting $H_0$ if it is wrong, is equal to the probability of consistency
of the system (3.4.1). Thus, the probability of consistency is the main characteristic
in the statistical problem of testing the hypotheses $H_0$ and $H_1$.
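
The test statistic itself, consistency of a system of equations $x_{i(t)} + x_{j(t)} = b_t$ in GF(2), is simple to compute. The following Python sketch is an illustration only: it uses a union-find structure with parities, which is one standard way to check such systems and is not taken from the text. Unknowns are numbered from 0, and the function name `is_consistent` is hypothetical.

```python
def is_consistent(n: int, equations: list[tuple[int, int, int]]) -> bool:
    """Check whether the system x_i + x_j = b over GF(2) has a solution.

    `equations` is a list of triples (i, j, b) with 0 <= i, j < n and b in {0, 1}.
    """
    parent = list(range(n))
    parity = [0] * n  # parity[v] = x_v + x_parent[v] (mod 2); 0 for roots

    def find(v: int) -> int:
        """Return the root of v and compress the path, keeping parities correct."""
        path = []
        while parent[v] != v:
            path.append(v)
            v = parent[v]
        acc = 0
        for u in reversed(path):  # from the node nearest the root downwards
            acc ^= parity[u]      # acc = x_u + x_root
            parent[u] = v
            parity[u] = acc
        return v

    for i, j, b in equations:
        ri, rj = find(i), find(j)
        if ri == rj:
            # x_i + x_j is already determined by earlier equations
            if parity[i] ^ parity[j] != b:
                return False
        else:
            # attach one component to the other so that x_i + x_j = b holds
            parent[ri] = rj
            parity[ri] = parity[i] ^ parity[j] ^ b
    return True


if __name__ == "__main__":
    x_true = [1, 0, 1, 1, 0]
    pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
    under_h0 = [(i, j, x_true[i] ^ x_true[j]) for i, j in pairs]
    print(is_consistent(5, under_h0))                                # consistent by construction
    print(is_consistent(3, [(0, 1, 1), (1, 2, 1), (0, 2, 1)]))       # odd cycle: inconsistent
```

In graph terms, the system is inconsistent exactly when some cycle of the graph carries right-hand sides with an odd sum, which is what the parity comparison detects.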

Section 3.6 is devoted to the other statistical problems that consist of recon- 
structing the true solution on the basis of a system of random equations with 
distorted right-hand sides. These results can be found in the paper [84]. 


4 


Random permutations 


4.1. Random permutations and the generalized 
scheme of allocation 


Denote by $S_n$ the set of all one-to-one mappings of the set $X_n = \{1, 2, \ldots, n\}$ into
itself. This set contains $n!$ elements. We consider a random permutation $\sigma$ that
equals any element of $S_n$ with probability $(n!)^{-1}$.

A permutation $s \in S_n$ can be written as

$$s = \begin{pmatrix} 1 & 2 & \ldots & n \\ s_1 & s_2 & \ldots & s_n \end{pmatrix},$$

where $s_k$ is the image of $k$ under the mapping $s$, $k = 1, \ldots, n$. The mapping $s$ can
be represented also by the graph $\Gamma_s = \Gamma(X_n, W_n)$ whose vertex set is $X_n$ and
whose edge set $W_n$ consists of the arcs $(k, s_k)$ directed from $k$ to $s_k$, $k = 1, \ldots, n$.
Since exactly one arc enters each vertex and exactly one arc emanates from each
vertex, the graph $\Gamma_s$ consists of connected components that are cycles, which
are called the cycles of the permutation $s$.
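
As a small illustration of this representation, the following Python sketch follows the arcs $(k, s_k)$ to list the cycles of a permutation. It is not part of the original text; the permutation is assumed to be given as the bottom row $s_1, \ldots, s_n$ of the two-row notation, and the function name `cycles` is hypothetical.

```python
def cycles(s: list[int]) -> list[list[int]]:
    """Cycles of the permutation k -> s[k-1] of {1, ..., n}.

    Each cycle is listed starting from its smallest element.
    """
    n = len(s)
    seen = [False] * (n + 1)
    result = []
    for start in range(1, n + 1):
        if seen[start]:
            continue
        cycle = []
        k = start
        while not seen[k]:  # follow the arcs (k, s_k) until the cycle closes
            seen[k] = True
            cycle.append(k)
            k = s[k - 1]
        result.append(cycle)
    return result


if __name__ == "__main__":
    print(cycles([2, 3, 1, 5, 4, 6]))
```

For the permutation with bottom row 2, 3, 1, 5, 4, 6 the graph has three components: a cycle of length 3, a cycle of length 2, and a fixed point.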

Denote by $\Gamma_n$ the random graph corresponding to the random permutation $\sigma$,
which takes the values $s$ with equal probabilities. It is obvious that
$P\{\Gamma_n = \Gamma_s\} = (n!)^{-1}$.

In Section 1.3, we showed that the generalized scheme of allocation introduced
in Section 1.2 can be applied to a wide class of problems related to the
behavior of the connected components of random graphs. In Example 1.3.1, we
showed that the generalized scheme can be used in the study of random permutations.
Recall that in the generalized scheme, we separate the subset of graphs
with exactly $N$ components, assign one of the $N!$ possible orders to the set
of these components, and denote by $\eta_1, \ldots, \eta_N$ the sizes of the components.
If there exist nonnegative independent identically distributed random variables
$\xi_1, \ldots, \xi_N$ such that for any integers $k_1, \ldots, k_N$,

$$P\{\eta_1 = k_1, \ldots, \eta_N = k_N\} = P\{\xi_1 = k_1, \ldots, \xi_N = k_N \mid \xi_1 + \cdots + \xi_N = n\}, \tag{4.1.1}$$

we say that the generalized scheme determined by the random variables $\xi_1, \ldots, \xi_N$
is applied to the random graph.

As was shown in Example 1.3.1, the generalized scheme that corresponds to the
random graph $\Gamma_n$ of a random permutation from $S_n$ is determined by the random
variables $\xi_1, \ldots, \xi_N$ with the distribution

$$P\{\xi_1 = k\} = -\frac{x^k}{k \log(1 - x)}, \quad k = 1, 2, \ldots, \quad 0 < x < 1, \tag{4.1.2}$$

since the number of elements in $S_n$ is $a_n = n!$ and the number of connected
realizations of the random graph $\Gamma_n$ is $b_n = (n - 1)!$.

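For the random permutations, the corresponding generating functions have the
form

$$A(x) = \sum_{n=0}^{\infty} \frac{a_n x^n}{n!} = \frac{1}{1 - x}, \qquad B(x) = \sum_{n=1}^{\infty} \frac{b_n x^n}{n!} = -\log(1 - x).$$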


Thus the study of various characteristics of random permutations can be ac- 
complished with the help of the generalized scheme. This is demonstrated for the 
most part in [78]. 

Recall some combinatorial identities that follow from the general results of 
Section 1.3. 

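Let $\nu_n$ be the number of cycles in a random permutation from $S_n$. Lemma 1.3.3
gives the equality

$$P\{\nu_n = N\} = \frac{(B(x))^N}{N! \, x^n} P\{\xi_1 + \cdots + \xi_N = n\}. \tag{4.1.3}$$

Denote by $\alpha_r$ the number of cycles of length $r$ in a random permutation from
$S_n$, $r = 1, \ldots, n$. According to Lemma 1.3.7, for any nonnegative integers
$m_1, \ldots, m_n$,

$$P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\} = \prod_{r=1}^{n} \frac{1}{r^{m_r} m_r!} \tag{4.1.4}$$

if $m_1 + 2m_2 + \cdots + n m_n = n$, and the probability is zero otherwise.

Let us introduce the generating function

$$\varphi_n(t_1, \ldots, t_n) = \sum_{m_1, \ldots, m_n} P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\} \, t_1^{m_1} \cdots t_n^{m_n} = \sum_{M_n} \frac{1}{m_1! \cdots m_n!} \left(\frac{t_1}{1}\right)^{m_1} \left(\frac{t_2}{2}\right)^{m_2} \cdots \left(\frac{t_n}{n}\right)^{m_n},$$

where the summation is over the set of integers

$$M_n = \{m_i \ge 0, \ i = 1, \ldots, n, \ m_1 + 2m_2 + \cdots + n m_n = n\}.$$

Put $\varphi_0 = 1$. It is not difficult to see that $\varphi_n(t_1, \ldots, t_n)$ is the coefficient of $u^n$ in
the expansion of $\exp\{u t_1 + u^2 t_2/2 + \cdots\}$:

$$\sum_{n=0}^{\infty} \varphi_n(t_1, \ldots, t_n) \, u^n = \exp\Big\{\sum_{n=1}^{\infty} \frac{u^n t_n}{n}\Big\}. \tag{4.1.5}$$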
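
The joint distribution (4.1.4) of the cycle counts can be confirmed by direct enumeration for small $n$. The Python sketch below is an illustration added here, not part of the original text: it lists all permutations of degree $n$, tabulates their cycle types $(m_1, \ldots, m_n)$, and evaluates the product $\prod_r 1/(r^{m_r} m_r!)$ in exact rational arithmetic. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def cycle_type(perm: tuple[int, ...]) -> tuple[int, ...]:
    """Return (m_1, ..., m_n), where m_r is the number of cycles of length r.

    `perm` is a permutation of 0, ..., n-1 given as a tuple of images.
    """
    n = len(perm)
    seen = [False] * n
    counts = [0] * (n + 1)
    for start in range(n):
        if seen[start]:
            continue
        length, k = 0, start
        while not seen[k]:
            seen[k] = True
            k = perm[k]
            length += 1
        counts[length] += 1
    return tuple(counts[1:])


def product_formula(m: tuple[int, ...]) -> Fraction:
    """Right-hand side of (4.1.4): the product over r of 1 / (r^{m_r} m_r!)."""
    p = Fraction(1)
    for r, m_r in enumerate(m, start=1):
        p /= r ** m_r * factorial(m_r)
    return p


def enumerated_distribution(n: int) -> dict[tuple[int, ...], Fraction]:
    """Exact distribution of the cycle type under the uniform distribution on S_n."""
    tally = Counter(cycle_type(p) for p in permutations(range(n)))
    return {m: Fraction(c, factorial(n)) for m, c in tally.items()}


if __name__ == "__main__":
    for m, prob in sorted(enumerated_distribution(4).items()):
        print(m, prob, product_formula(m))
```

The enumeration is only feasible for small $n$ (it visits all $n!$ permutations), which is enough to see the formula at work.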


The generating function (4.1.5) was obtained by Goncharov and was the basis 
of his pioneering investigations of random permutations [53]. In [78], the approach 
based on the generalized scheme of allocations was used in such investigations. In 
the next sections, we will present some examples of how the generalized scheme 
of allocation can be applied to random permutations. This will supplement the 
investigations presented in [78]. 


4.2. The number of cycles 


It is well known that the number of cycles $\nu_n$ in a random permutation from $S_n$ is
asymptotically normal with parameters $(\log n, \log n)$ as $n \to \infty$. More precisely,
as $n \to \infty$,

$$P\{\nu_n = N\} = \frac{1}{\sqrt{2\pi \log n}} e^{-u^2/2} (1 + o(1)) \tag{4.2.1}$$

uniformly in the integers $N$ such that $u = (N - \log n)/\sqrt{\log n}$ lies in any fixed
finite interval.

The approach based on the generalized scheme of allocation makes it possible
to obtain the asymptotics of the probability $P\{\nu_n = N\}$ for all possible values of
$N = N(n)$ as $n \to \infty$. According to (4.1.3), for any integer $N$,

$$P\{\nu_n = N\} = \frac{(-\log(1 - x))^N}{N! \, x^n} P\{\xi_1 + \cdots + \xi_N = n\}, \tag{4.2.2}$$

where the parameter $x$ can be taken arbitrarily from the interval $(0, 1)$, and
$\xi_1, \ldots, \xi_N$ are independent identically distributed random variables with distribution
(4.1.2).

Thus, to study the asymptotic behavior of the distribution of $\nu_n$, it is sufficient
to obtain the corresponding local limit theorems for the sum

$$\zeta_N = \xi_1 + \cdots + \xi_N,$$

where the parameter $x$ in the distribution of the summands can be chosen so that
obtaining the local theorems becomes simple.

We begin with $x = 1 - 1/n$ and prove a series of limit theorems that make it
possible to describe the behavior of the probability $P\{\nu_n = N\}$ for the values of
$N$ not too far from $\log n$.
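
The representation (4.2.2) holds for every $x$ in $(0, 1)$, and this freedom is what the rest of the section exploits. The Python sketch below, added here as an illustration, checks the identity for a small $n$: the left-hand side is computed from the unsigned Stirling numbers of the first kind (the number of permutations of degree $n$ with $N$ cycles), and the right-hand side from an $N$-fold convolution of the distribution (4.1.2). The function names are hypothetical.

```python
from math import factorial, log


def stirling_cycle_row(n: int) -> list[int]:
    """Unsigned Stirling numbers of the first kind c(n, N), N = 0, ..., n."""
    row = [1]  # c(0, 0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for N in range(m + 1):
            # c(m, N) = c(m-1, N-1) + (m-1) c(m-1, N)
            a = row[N - 1] if N >= 1 else 0
            b = row[N] if N < m else 0
            new[N] = a + (m - 1) * b
        row = new
    return row


def prob_sum_equals(n: int, N: int, x: float) -> float:
    """P{xi_1 + ... + xi_N = n} for i.i.d. summands with distribution (4.1.2)."""
    B = -log(1.0 - x)
    pmf = [0.0] + [x ** k / (k * B) for k in range(1, n + 1)]
    dist = [1.0] + [0.0] * n  # distribution of the empty sum
    for _ in range(N):
        new = [0.0] * (n + 1)
        for total, p in enumerate(dist):
            if p:
                for k in range(1, n - total + 1):
                    new[total + k] += p * pmf[k]
        dist = new
    return dist[n]


def cycles_prob_via_scheme(n: int, N: int, x: float) -> float:
    """Right-hand side of (4.2.2)."""
    B = -log(1.0 - x)
    return B ** N / (factorial(N) * x ** n) * prob_sum_equals(n, N, x)


if __name__ == "__main__":
    n = 7
    row = stirling_cycle_row(n)
    for N in range(1, n + 1):
        print(N, row[N] / factorial(n),
              cycles_prob_via_scheme(n, N, 0.3), cycles_prob_via_scheme(n, N, 0.9))
```

The three printed columns agree up to rounding, whatever value of $x$ is used.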


Theorem 4.2.1. If $n \to \infty$, $N = \gamma \log n + o(\log n)$, where $\gamma$ is a constant,
$0 < \gamma < \infty$, then

$$P\{\zeta_N = k\} = \frac{1}{n \Gamma(\gamma)} z^{\gamma-1} e^{-z} (1 + o(1))$$

uniformly in the integers $k$ such that $z = k/n$ lies in any interval of the form
$0 < z_0 \le z \le z_1$, where $z_0$ and $z_1$ are constants.


Before proving the theorem, we obtain some auxiliary results. We have chosen
$x = 1 - 1/n$. For such $x$,

$$P\{\xi_1 = k\} = \frac{(1 - 1/n)^k}{k \log n}, \quad k = 1, 2, \ldots, \tag{4.2.3}$$

and the characteristic function of the random variable $\xi_1$ equals

$$\varphi_n(t) = -\frac{1}{\log n} \log\left(1 - e^{it} + \frac{1}{n} e^{it}\right).$$

Represent $\varphi_n(t)$ in the form

$$\varphi_n(t) = -\frac{1}{\log n} \left(\log\left(\frac{1}{n} - it\right) + \log(1 + \psi_1(t) + \psi_2(t))\right), \tag{4.2.4}$$

where

$$\psi_1(t) = \frac{1 + it - e^{it}}{1/n - it}, \qquad \psi_2(t) = \frac{e^{it} - 1}{1 - itn}.$$

For $\psi_1(t)$ and $\psi_2(t)$, the following estimates are valid:

$$|\psi_1(t)| \le \frac{|e^{it} - 1 - it|}{|t|} \le \frac{|t|}{2}, \tag{4.2.5}$$

$$|\psi_2(t)| \le \frac{|e^{it} - 1|}{n|t|} \le \frac{1}{n}. \tag{4.2.6}$$

By using the explicit form of $\varphi_n(t)$, the representation (4.2.4), and the bounds
(4.2.5) and (4.2.6), we obtain the following estimates of $\varphi_n(t)$.

Lemma 4.2.1. If $n \to \infty$, $N = \gamma \log n + o(\log n)$, where $\gamma$ is a constant,
$0 < \gamma < \infty$, then for any fixed $t$,

$$\varphi_n^N\left(\frac{t}{n}\right) \to \frac{1}{(1 - it)^{\gamma}}.$$

Lemma 4.2.2. If $n \to \infty$, $N = \gamma \log n + o(\log n)$, where $\gamma$ is a constant,
$0 < \gamma < \infty$, then there exist positive constants $\varepsilon$ and $c$ such that for $|t/n| \le \varepsilon$,

$$|\varphi_n^N(t/n)| \le c |t|^{-\gamma}.$$


Lemma 4.2.3. If $n \to \infty$, then for $0 < \varepsilon \le |t| \le \pi$, where $\varepsilon$ is an arbitrary
constant, there exists a constant $c$ such that for sufficiently large $n$,

$$|\varphi_n(t)| \le c/\log n.$$


Lemma 4.2.4. If $n \to \infty$, then there exists a positive constant $\varepsilon$ such that for
$|t/n| \le \varepsilon$,

$$\left|\frac{1}{n} \varphi_n'\left(\frac{t}{n}\right)\right| \le \frac{2}{(1 + |t|) \log n}.$$


As follows from Lemma 4.2.1, as $n \to \infty$ and $N = \gamma \log n + o(\log n)$, where $\gamma$
is a constant, $0 < \gamma < \infty$, the distributions of the normalized sums $\zeta_N/n$ converge
to the gamma distribution with characteristic function $(1 - it)^{-\gamma}$ and density
$z^{\gamma-1} e^{-z}/\Gamma(\gamma)$, $z > 0$. Actually, as stated in Theorem 4.2.1, these distributions
become close locally.


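Proof of Theorem 4.2.1. By the inversion formula, the probability

$$P\{\zeta_N = k\} = P\{\zeta_N/n = z\}$$

can be represented in the form

$$P\{\zeta_N/n = z\} = \frac{1}{2\pi n} \int_{-\pi n}^{\pi n} e^{-itz} \varphi_n^N(t/n) \, dt,$$

and

$$\frac{1}{\Gamma(\gamma)} z^{\gamma-1} e^{-z} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-itz}}{(1 - it)^{\gamma}} \, dt.$$

Hence,

$$2\pi n P\{\zeta_N/n = z\} - \frac{2\pi}{\Gamma(\gamma)} z^{\gamma-1} e^{-z} = I_1 + I_2 + I_3 + I_4,$$

where

$$I_1 = \int_{-A}^{A} e^{-itz} \left(\varphi_n^N(t/n) - (1 - it)^{-\gamma}\right) dt,$$

$$I_2 = \int_{A < |t| \le \varepsilon n} e^{-itz} \varphi_n^N(t/n) \, dt,$$

$$I_3 = \int_{\varepsilon n < |t| \le \pi n} e^{-itz} \varphi_n^N(t/n) \, dt,$$

$$I_4 = -\int_{A < |t|} e^{-itz} (1 - it)^{-\gamma} \, dt,$$

with the constants $\varepsilon$ and $A$ to be chosen later.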

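By Lemma 4.2.1, $\varphi_n^N(t/n) \to (1 - it)^{-\gamma}$ for any fixed $t$. By Theorem 1.1.9,
this means that the convergence is uniform with respect to $t$ in any finite interval.
Therefore $I_1 \to 0$ for any fixed $A$ as $n \to \infty$.

By Lemma 4.2.3, for sufficiently large $n$,

$$|I_3| \le 2\pi n (c/\log n)^N \le 2\pi n e^{-2N/\gamma},$$

and, for $N = \gamma \log n + o(\log n)$, the right-hand side tends to zero as $n \to \infty$.
To estimate $I_2$ and $I_4$, we integrate by parts. For $I_4$, this leads to

$$\int_A^{\infty} e^{-itz} (1 - it)^{-\gamma} \, dt = -\frac{e^{-itz}}{iz(1 - it)^{\gamma}} \Big|_A^{\infty} + \frac{\gamma}{z} \int_A^{\infty} \frac{e^{-itz}}{(1 - it)^{\gamma+1}} \, dt.$$

Therefore

$$|I_4| \le \frac{2}{z(1 + A^2)^{\gamma/2}} + \frac{2\gamma}{z} \int_A^{\infty} \frac{dt}{(1 + t^2)^{(\gamma+1)/2}} \le \frac{2}{zA^{\gamma}} + \frac{2\gamma}{z} \int_A^{\infty} \frac{dt}{t^{\gamma+1}} \le \frac{c_4}{A^{\gamma}},$$

where $c_4$ is a constant, and $I_4$ can be made arbitrarily small by the choice of
sufficiently large $A$.
Similarly,

$$\int_A^{\varepsilon n} e^{-itz} \varphi_n^N\left(\frac{t}{n}\right) dt = -\frac{e^{-itz}}{iz} \varphi_n^N\left(\frac{t}{n}\right) \Big|_A^{\varepsilon n} + \frac{N}{iz} \int_A^{\varepsilon n} e^{-itz} \varphi_n^{N-1}\left(\frac{t}{n}\right) \frac{1}{n} \varphi_n'\left(\frac{t}{n}\right) dt.$$

By using the estimates of Lemmas 4.2.2, 4.2.3, and 4.2.4, we obtain

$$|I_2| \le \frac{2}{z} \left|\varphi_n^N\left(\frac{A}{n}\right)\right| + \frac{2}{z} |\varphi_n^N(\varepsilon)| + \frac{2N}{z} \int_A^{\varepsilon n} \left|\varphi_n^{N-1}\left(\frac{t}{n}\right)\right| \left|\frac{1}{n} \varphi_n'\left(\frac{t}{n}\right)\right| dt \le \frac{c_1}{A^{\gamma}} + c_2 \left(\frac{c}{\log n}\right)^N + c_3 \int_A^{\infty} \frac{dt}{t^{\gamma+1}},$$

where $c_1$, $c_2$, and $c_3$ are constants.
If we choose sufficiently large $A$ and $n$, we can make $|I_2|$ arbitrarily small.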


Now we can prove the following theorem on the behavior of the probability 
P{v, = N}. 


Theorem 4.2.2. If $n \to \infty$ and $N = \gamma \log n + o(\log n)$, where $\gamma$ is a constant,
$0 < \gamma < \infty$, then

$$P\{\nu_n = N\} = \frac{(\log n)^N}{n \, N! \, \Gamma(\gamma)} (1 + o(1)).$$

Proof. For $x = 1 - 1/n$, the representation (4.2.2) takes the form

$$P\{\nu_n = N\} = \frac{(\log n)^N}{N! \, (1 - 1/n)^n} P\{\zeta_N = n\}, \tag{4.2.7}$$

where $\zeta_N = \xi_1 + \cdots + \xi_N$ is the sum of independent identically distributed random
variables with distribution (4.2.3). By Theorem 4.2.1,

$$P\{\zeta_N = n\} = \frac{1}{n \Gamma(\gamma)} e^{-1} (1 + o(1)).$$

By substituting this expression into (4.2.7), we obtain the assertion of
Theorem 4.2.2. □


The case where $\gamma = N/\log n \to 0$ is described by the following theorem.

Theorem 4.2.3. If $n \to \infty$ and $\gamma = N/\log n \to 0$, then

$$P\{\zeta_N = n\} = N P\{\xi_1 = n\} (1 + o(1)) = \frac{e^{-1}}{n \Gamma(\gamma)} (1 + o(1)).$$

Proof. Taking into account that $\gamma < 1/2$ beginning with some $n$, we choose the
level $n(1 - \gamma)$ and represent the probability $P\{\zeta_N = n\}$ as follows:

$$P\{\zeta_N = n\} = P\{\zeta_N = n, \ \xi_i \le n(1 - \gamma), \ i = 1, \ldots, N\} + N P\{\zeta_N = n, \ \xi_N > n(1 - \gamma)\}. \tag{4.2.8}$$

Since

$$P\{\xi_1 = m\} = \frac{e^{-1}}{n \log n} (1 + o(1))$$

uniformly in $m$, $n \ge m > n(1 - \gamma)$, we see that

$$P\{\zeta_N = n, \ \xi_N > n(1 - \gamma)\} = \sum_{m > n(1-\gamma)} P\{\xi_N = m, \ \zeta_{N-1} = n - m\} = \frac{e^{-1}}{n \log n} P\{\zeta_{N-1} < \gamma n\} (1 + o(1)). \tag{4.2.9}$$

We now prove

$$P\{\zeta_{N-1} < \gamma n\} \to 1. \tag{4.2.10}$$

We show that the random variable $\zeta_N/(\gamma n)$ converges in probability to zero. By the
representation (4.2.4) and the estimates (4.2.5) and (4.2.6),

$$\varphi_n\left(\frac{t}{\gamma n}\right) = -\frac{1}{\log n} \left(\log\left(\frac{1}{n} - \frac{it}{\gamma n}\right) + O\left(\frac{1}{\gamma n}\right)\right) = 1 - \frac{\log(\gamma - it) - \log \gamma}{\log n} + O\left(\frac{1}{\gamma n \log n}\right),$$

and if $\gamma = N/\log n \to 0$, then

$$\varphi_n^N\left(\frac{t}{\gamma n}\right) = \left(1 - \frac{\log(\gamma - it) - \log \gamma}{\log n} + O\left(\frac{1}{\gamma n \log n}\right)\right)^N \to 1.$$

Thus, the characteristic function of $\zeta_N/(\gamma n)$ converges to the characteristic function
of the random variable that assumes the value 0 with probability 1, and we
obtain (4.2.10).

With some technical difficulties, it can be proved that under the conditions of
the theorem,

$$P\{\zeta_N = n, \ \xi_i \le n(1 - \gamma), \ i = 1, \ldots, N\} = o(1/(n \log n)).$$

The assertion of the theorem follows from this relation and the relations (4.2.8),
(4.2.9), and (4.2.10). □


Theorem 4.2.4. If $n \to \infty$ and $\gamma = N/\log n \to 0$, then

$$P\{\nu_n = N\} = \frac{(\log n)^{N-1}}{n \, (N - 1)!} (1 + o(1)).$$

Proof. The assertion of the theorem follows immediately from Theorem 4.2.3
and representation (4.2.7) if we take into account that the gamma function
$\Gamma(\gamma) = (1/\gamma)(1 + o(1))$ as $\gamma \to 0$. □

Now consider the case where $N/\log n \to \infty$. We distinguish four subcases:
$\alpha = n/N \to \infty$; $\alpha \to c > 1$; $\alpha \to 1$ with $m = n - N \to \infty$; and $\alpha \to 1$ with $m$
fixed.

Let $\alpha \to \infty$. We must select the value of the parameter $x$ so that $E\zeta_N$ is close
to $n$. Since for $\xi_1$ with distribution (4.1.2),

$$E\xi_1 = -\frac{x}{(1 - x) \log(1 - x)},$$

we choose $x$, $0 < x < 1$, such that

$$-\frac{x}{(1 - x) \log(1 - x)} = \alpha, \tag{4.2.11}$$

where $\alpha = n/N$. This equation is approximately satisfied if we take

$$x = 1 - \frac{1}{\alpha \log \alpha}.$$

If $N/\log n \to \infty$, then $x = 1 - 1/(\alpha \log \alpha)$ is farther from the singular point
$x = 1$ than $x = 1 - 1/n$, and therefore the normal approximation is valid for the
sum $\zeta_N$.
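
Equation (4.2.11) has no closed-form solution, but since the mean $E\xi_1$ increases from 1 to $\infty$ as $x$ runs over $(0, 1)$, its root is easy to find numerically. The Python sketch below is an illustration added here, with bisection chosen for simplicity; the function names and the bracketing interval are assumptions of the sketch. It also evaluates the variance $D\xi_1 = -(x \log(1-x) + x^2)/((1-x)^2 \log^2(1-x))$, which follows from $E\xi_1^2 = -x/((1-x)^2 \log(1-x))$.

```python
from math import log1p


def mean_xi(x: float) -> float:
    """E xi_1 = -x / ((1 - x) log(1 - x)) for the distribution (4.1.2)."""
    return -x / ((1.0 - x) * log1p(-x))


def variance_xi(x: float) -> float:
    """D xi_1 = -(x log(1 - x) + x^2) / ((1 - x)^2 log^2(1 - x))."""
    L = log1p(-x)
    return -(x * L + x * x) / ((1.0 - x) ** 2 * L * L)


def solve_x(alpha: float) -> float:
    """Root of mean_xi(x) = alpha in (0, 1), found by bisection (alpha > 1)."""
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1")
    lo, hi = 1e-12, 1.0 - 1e-12  # assumed bracket; adequate for moderate alpha
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_xi(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for alpha in (1.01, 2.0, 50.0):
        x = solve_x(alpha)
        print(alpha, x, mean_xi(x), variance_xi(x))
```

For $\alpha$ close to 1 the root is close to $2(\alpha - 1)$, and for large $\alpha$ it approaches 1, in line with the choices of $x$ made in the different subcases.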


Theorem 4.2.5. If $n, N \to \infty$ such that $N/\log n \to \infty$, $\alpha = n/N \to \infty$, and
the parameter $x = 1 - 1/(\alpha \log \alpha)$ and $\sigma_\alpha = \alpha \sqrt{\log \alpha}$, then

$$P\{\zeta_N = k\} = \frac{1}{\sigma_\alpha \sqrt{2\pi N}} e^{-z^2/2} (1 + o(1))$$

uniformly in the integers $k$ such that $z = (k - n)/(\sigma_\alpha \sqrt{N})$ lies in any fixed finite
interval.

Proof. The characteristic function of the random variable $\xi_1$ is

$$\varphi(t) = \frac{\log(1 - x e^{it})}{\log(1 - x)}.$$

It is easy to see that for any fixed $t$, as $N/\log n \to \infty$ and $\alpha = n/N \to \infty$,

$$\exp\left\{-\frac{it\alpha}{\sigma_\alpha \sqrt{N}}\right\} \varphi\left(\frac{t}{\sigma_\alpha \sqrt{N}}\right) = 1 - \frac{t^2}{2N} + o\left(\frac{1}{N}\right).$$

Denote by $\psi_N(t)$ the characteristic function of $\zeta_N^* = (\zeta_N - n)/(\sigma_\alpha \sqrt{N})$; then,
under the conditions of the theorem, for any fixed $t$,

$$\psi_N(t) = \left(\exp\left\{-\frac{it\alpha}{\sigma_\alpha \sqrt{N}}\right\} \varphi\left(\frac{t}{\sigma_\alpha \sqrt{N}}\right)\right)^N \to e^{-t^2/2},$$

and the distribution of $\zeta_N^*$ converges weakly to the normal distribution with
parameters $(0, 1)$.

The local convergence can be proved by the standard reasoning and we omit
this technical part of the proof of Theorem 4.2.5. □


From Theorem 4.2.5 and representation (4.2.2), we obtain the following assertion.

Theorem 4.2.6. If $n, N \to \infty$ such that $N/\log n \to \infty$, $\alpha = n/N \to \infty$, then

$$P\{\nu_n = N\} = \frac{(-\log(1 - x))^N}{N! \, x^n \sigma_\alpha \sqrt{2\pi N}} (1 + o(1)),$$

where $x = 1 - 1/(\alpha \log \alpha)$ and $\sigma_\alpha = \alpha \sqrt{\log \alpha}$.

The following theorem for the case where $\alpha$ tends to a constant greater than 1
can be proved in the same way as Theorem 4.2.5.


Theorem 4.2.7. If $n, N \to \infty$ and there exist constants $\alpha_0$ and $\alpha_1$ such that
$1 < \alpha_0 \le \alpha \le \alpha_1 < \infty$, the parameter $x = x_\alpha$, where $x_\alpha$ is the unique solution of
equation (4.2.11) in the interval $(0, 1)$, and

$$\sigma_\alpha^2 = -\frac{x_\alpha \log(1 - x_\alpha) + x_\alpha^2}{(1 - x_\alpha)^2 \log^2(1 - x_\alpha)},$$

then

$$P\{\zeta_N = k\} = \frac{1}{\sigma_\alpha \sqrt{2\pi N}} e^{-z^2/2} (1 + o(1))$$

uniformly in the integers $k$ such that $z = (k - n)/(\sigma_\alpha \sqrt{N})$ lies in any fixed
finite interval.

Proof. The proof is similar to the proof of Theorem 4.2.5 and we omit the details.
Note only that $\alpha = E\xi_1$ and $\sigma_\alpha^2 = D\xi_1$ for $x = x_\alpha$. □


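Using Theorem 4.2.7 and representation (4.2.2), we obtain the following assertion
on the distribution of $\nu_n$.

Theorem 4.2.8. If $n, N \to \infty$ and there exist constants $\alpha_0$ and $\alpha_1$ such that
$1 < \alpha_0 \le \alpha \le \alpha_1 < \infty$, then

$$P\{\nu_n = N\} = \frac{(-\log(1 - x_\alpha))^N}{N! \, x_\alpha^n \sigma_\alpha \sqrt{2\pi N}} (1 + o(1)),$$

where $x_\alpha$ is the unique solution of equation (4.2.11) in the interval $(0, 1)$, and

$$\sigma_\alpha^2 = -\frac{x_\alpha \log(1 - x_\alpha) + x_\alpha^2}{(1 - x_\alpha)^2 \log^2(1 - x_\alpha)}.$$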


The asymptotic normality of $\zeta_N$ is preserved if $\alpha = n/N \to 1$ slowly, as
specified below.

Theorem 4.2.9. If $n, N \to \infty$ such that $\alpha = n/N \to 1$ and $m = n - N \to \infty$,
and the parameter $x = x_\alpha$, where $x_\alpha$ is the unique solution of equation (4.2.11) in
the interval $(0, 1)$, then

$$P\{\zeta_N = k\} = \frac{1}{\sqrt{2\pi m}} e^{-z^2/2} (1 + o(1))$$

uniformly in the integers $k$ such that $z = (k - n)/\sqrt{m}$ lies in any fixed finite
interval.

The proof is similar to the proof of Theorem 4.2.5 and we omit the details.
From Theorem 4.2.9 and representation (4.2.2), we obtain the following assertion
on the behavior of $P\{\nu_n = N\}$.

Theorem 4.2.10. If $n, N \to \infty$ such that $\alpha = n/N \to 1$ and $m = n - N \to \infty$,
then

$$P\{\nu_n = N\} = \frac{(-\log(1 - x_\alpha))^N}{N! \, x_\alpha^n \sqrt{2\pi m}} (1 + o(1)),$$

where $x_\alpha$ is the unique solution of equation (4.2.11).


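It is not difficult to see that if $m^2/N \to 0$, then

$$x_\alpha = \frac{2m}{N} (1 + O(m/N)),$$

and consequently

$$(-\log(1 - x_\alpha))^N = x_\alpha^N (1 + x_\alpha/2 + O(x_\alpha^2))^N = x_\alpha^N e^m (1 + O(m^2/N)),$$

$$x_\alpha^m = \frac{2^m m^m}{N^m} (1 + O(m^2/N)).$$

Therefore it follows from Theorem 4.2.10 and the Stirling formula for $m!$ that if
$n, N \to \infty$, $\alpha = n/N \to 1$, $m \to \infty$, and $m^2/N \to 0$, then

$$P\{\nu_n = N\} = \frac{N^m}{2^m m! \, N!} (1 + o(1)).$$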


Finally we consider the case where m is bounded. 


Theorem 4.2.11. If $N \to \infty$ and the parameter $x = 1/N$, then for any fixed
$k = 0, 1, \ldots$,

$$P\{\zeta_N = N + k\} = \frac{1}{2^k k!} e^{-1/2} (1 + o(1)).$$

Proof. By expanding the characteristic function $\varphi(t)$ of the random variable $\xi_1$
with parameter $x = 1/N$, we obtain for any fixed $t$,

$$\varphi(t) = \frac{\log(1 - x e^{it})}{\log(1 - x)} = e^{it} \left(1 + \frac{x}{2} (e^{it} - 1) + O(x^2)\right).$$

If $x = 1/N$ and $N \to \infty$, then the characteristic function of $\zeta_N - N$ is equal to
$(1 + (e^{it} - 1)/(2N) + O(N^{-2}))^N$ and tends to $e^{(e^{it} - 1)/2}$. This means that the
distribution of $\zeta_N - N$ converges to the Poisson distribution with parameter 1/2. □


From this theorem and representation (4.2.2), we obtain the following assertion,
which completes the description of the asymptotic behavior of the distribution
of $\nu_n$.

Theorem 4.2.12. If $n \to \infty$, $n/N \to 1$, and $m = n - N$ is fixed, then

$$P\{\nu_n = N\} = \frac{N^m}{N! \, 2^m m!} (1 + o(1)).$$

It is not difficult to see that Theorems 4.2.2, 4.2.4, 4.2.6, 4.2.8, 4.2.10, and
4.2.12 give a complete description of the asymptotic behavior of the distribution
of the number of cycles in a random permutation of degree $n$ as $n \to \infty$.
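
The last regime is the easiest one to examine numerically, because the approximation $N^m/(N! \, 2^m m!)$ of Theorem 4.2.12 is already accurate for moderate $n$. The Python sketch below is an illustration added here and not part of the original text: it computes $P\{\nu_n = N\}$ exactly from the unsigned Stirling numbers of the first kind and forms the ratio of the approximation to the exact value. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def stirling_cycle_row(n: int) -> list[int]:
    """Unsigned Stirling numbers of the first kind c(n, N), N = 0, ..., n."""
    row = [1]  # c(0, 0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for N in range(m + 1):
            # c(m, N) = c(m-1, N-1) + (m-1) c(m-1, N)
            a = row[N - 1] if N >= 1 else 0
            b = row[N] if N < m else 0
            new[N] = a + (m - 1) * b
        row = new
    return row


def exact_cycle_prob(n: int, N: int) -> Fraction:
    """Exact P{nu_n = N} = c(n, N) / n! under the uniform distribution on S_n."""
    return Fraction(stirling_cycle_row(n)[N], factorial(n))


def approx_fixed_m(n: int, N: int) -> Fraction:
    """Approximation N^m / (N! 2^m m!) with m = n - N (regime of fixed m)."""
    m = n - N
    return Fraction(N ** m, factorial(N) * 2 ** m * factorial(m))


if __name__ == "__main__":
    for n, N in [(50, 49), (200, 198), (200, 197)]:
        ratio = approx_fixed_m(n, N) / exact_cycle_prob(n, N)
        print(n, N, float(ratio))
```

For $m = 1$ the approximation coincides with the exact value, since a permutation with $n - 1$ cycles is a single transposition; for larger fixed $m$ the ratio tends to 1 as $n$ grows.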


4.3. Permutations with restrictions on cycle lengths 


In this section, we present some results on permutations with restrictions on their
cycle lengths. Let $R$ be a subset of the set of natural numbers. We consider the
set $S_{n,R}$ of all permutations of degree $n$ with cycle lengths from the set $R$. One of
the first questions that arises in this situation concerns the asymptotic behavior of
the number $a_{n,R}$ of elements in $S_{n,R}$. This problem is far from being completely
solved. Here we describe some of the solutions provided by an approach based on
the generalized scheme of allocation.

Let the uniform distribution be defined on $S_{n,R}$ and let $\nu_{n,R}$ be the total number
of cycles in a random permutation from this set. Put $b_{n,R} = (n - 1)!$ if $n \in R$, and
$b_{n,R} = 0$ otherwise. It is easy to see that

$$P\{\nu_{n,R} = N\} = \frac{n!}{N! \, a_{n,R}} \sum_{n_1 + \cdots + n_N = n} \frac{b_{n_1,R} \cdots b_{n_N,R}}{n_1! \cdots n_N!}. \tag{4.3.1}$$

We introduce independent identically distributed random variables
$\xi_1^{(R)}, \ldots, \xi_N^{(R)}$ with distribution

$$P\{\xi_1^{(R)} = k\} = \frac{b_{k,R} \, x^k}{k! \, B_R(x)} = \frac{x^k}{k B_R(x)}, \quad k \in R, \tag{4.3.2}$$

where

$$B_R(x) = \sum_{k=1}^{\infty} \frac{b_{k,R} \, x^k}{k!} = \sum_{k \in R} \frac{x^k}{k}.$$

By using these random variables, we can rewrite (4.3.1) in the form

$$P\{\nu_{n,R} = N\} = \frac{n! \, (B_R(x))^N}{N! \, x^n a_{n,R}} P\{\xi_1^{(R)} + \cdots + \xi_N^{(R)} = n\}. \tag{4.3.3}$$

Hence, summing over $N$, we obtain

$$a_{n,R} = \frac{n! \, e^{B_R(x)}}{x^n} \sum_{N=1}^{\infty} \frac{(B_R(x))^N}{N!} e^{-B_R(x)} P\{\xi_1^{(R)} + \cdots + \xi_N^{(R)} = n\}. \tag{4.3.4}$$

It is clear that above we have repeated the general approach of Section 1.3 for
the case of the set $S_{n,R}$, and relations (4.3.1), (4.3.3), and (4.3.4) are the realizations
of the general relations (1.3.1), (1.3.10), and (1.3.11), respectively.

To find the asymptotics of the numbers $a_{n,R}$, it is sufficient to choose an
appropriate value of the parameter $x$, substitute it into the expression of the distribution
(4.3.2), and then prove a local limit theorem for the sum of independent random
variables with this distribution.

We succeed in obtaining results on $a_{n,R}$ only if the structure of $R$ has some
regularity. In the general case, the asymptotics of $a_{n,R}$ is unknown.

To demonstrate the approach, we consider first a simple case where R is the set 
E of even numbers. 


Theorem 4.3.1. If $n \to \infty$, then

$$a_{n,E} = 2 \left(\frac{n}{e}\right)^n (1 + o(1)) \tag{4.3.5}$$

for even $n$, and $a_{n,E} = 0$ for odd $n$.

Proof. To prove the theorem, we use the representation (4.3.4). We consider
the random variables $\xi_1^{(E)}, \ldots, \xi_N^{(E)}$ with distribution (4.3.2), where
$R = E = \{2, 4, \ldots\}$, and

$$B_R(x) = B_E(x) = \sum_{k=1}^{\infty} \frac{x^{2k}}{2k} = -\frac{1}{2} \log(1 - x^2).$$

The random variables $\xi_i = \xi_i^{(E)}/2$, $i = 1, \ldots, N$, are independent identically
distributed, and

$$P\{\xi_1 = k\} = -\frac{x^{2k}}{k \log(1 - x^2)}, \quad k = 1, 2, \ldots. \tag{4.3.6}$$

If we choose $x = \sqrt{1 - 1/n}$, then this distribution coincides with distribution
(4.2.3) from the previous section, and according to Theorem 4.2.1, if $n \to \infty$,
$N = \gamma \log n + o(\log n)$, where $\gamma$ is a constant, $0 < \gamma < \infty$, then

$$P\{\xi_1 + \cdots + \xi_N = k\} = \frac{1}{n \Gamma(\gamma)} z^{\gamma-1} e^{-z} (1 + o(1))$$

uniformly in the integers $k$ such that $z = k/n$ lies in any fixed interval of the form
$0 < z_0 \le z \le z_1$, where $z_0$ and $z_1$ are constants.
Since

$$\zeta_N^{(E)} = \xi_1^{(E)} + \cdots + \xi_N^{(E)} = 2(\xi_1 + \cdots + \xi_N),$$

we obtain that if $n \to \infty$, $N = (\log n)/2 + o(\log n)$, and $n$ is even, then

$$P\{\zeta_N^{(E)} = n\} = P\{\xi_1 + \cdots + \xi_N = n/2\} = \sqrt{\frac{2}{\pi e}} \, \frac{1}{n} (1 + o(1)). \tag{4.3.7}$$

For odd $n$, this probability equals zero.

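To obtain $a_{n,E}$ with the help of relation (4.3.4), we have to sum the probabilities
$P\{\zeta_N^{(E)} = n\}$ with the Poisson coefficients. To this end, we need to estimate these
probabilities for all $N$. We show that for all $N$,

$$P\{\zeta_N^{(E)} = n\} \le \frac{2N}{n \log n}. \tag{4.3.8}$$

This bound is a consequence of the following chain of estimates. It follows from
(4.3.2) that

$$P\{\zeta_N^{(E)} = n\} = \frac{x^n}{(B_E(x))^N} \sum_{K(n,N)} \frac{1}{k_1 \cdots k_N},$$

where

$$K(n, N) = \{k_1, \ldots, k_N : \ k_1 + \cdots + k_N = n, \ k_1, \ldots, k_N \in E\}.$$

Hence,

$$\begin{aligned}
P\{\zeta_N^{(E)} = n\} &= \frac{x^n}{n (B_E(x))^N} \sum_{K(n,N)} \frac{k_1 + \cdots + k_N}{k_1 \cdots k_N} \\
&\le \frac{N}{n (B_E(x))^N} \sum_{k_1, \ldots, k_{N-1} \in E} \frac{x^{k_1} \cdots x^{k_{N-1}}}{k_1 \cdots k_{N-1}} \\
&\le \frac{N}{n (B_E(x))^N} \Big(\sum_{k \in E} \frac{x^k}{k}\Big)^{N-1} = \frac{N}{n B_E(x)}.
\end{aligned}$$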

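We obtain relation (4.3.8) because $B = B_E(x) = (\log n)/2$.
We split the sum

$$S = \sum_{N=1}^{\infty} \frac{B^N e^{-B}}{N!} P\{\zeta_N^{(E)} = n\}$$

into four summands $S_1$, $S_2$, $S_3$, $S_4$, dividing the domain of summation into four parts:

$$A_1 = \{N : \ N < B - B^{3/4}\},$$

$$A_2 = \{N : \ B - B^{3/4} \le N \le B + B^{3/4}\},$$

$$A_3 = \{N : \ B + B^{3/4} < N \le B + B^2\},$$

$$A_4 = \{N : \ B + B^2 < N \le n/2\}.$$

It is not difficult to see that relation (4.3.7) is satisfied uniformly in $N \in A_2$.
Therefore

$$S_2 = \sum_{N \in A_2} \frac{B^N e^{-B}}{N!} P\{\zeta_N^{(E)} = n\} = \sqrt{\frac{2}{\pi}} \, e^{-1/2} \, \frac{1}{n} (1 + o(1)) \sum_{N \in A_2} \frac{B^N e^{-B}}{N!} = \sqrt{\frac{2}{\pi e}} \, \frac{1}{n} (1 + o(1)),$$

since $B = (\log n)/2$, and as $n \to \infty$,

$$\sum_{N \in A_2} \frac{B^N e^{-B}}{N!} \to 1.$$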

The remaining part of the sum is o(1/n). Indeed, by applying estimate (4.3.8), 
we obtain 


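as n → ∞.
To estimate S_3, note that

\[ \sum_{N \in A_3} \frac{B^N e^{-B}}{N!} \le \sum_{N > B + B^{3/4}} \frac{B^N e^{-B}}{N!} \le c_1 \int_{B^{1/4}}^{\infty} e^{-u^2/2}\, du \le c_2\, e^{-\sqrt{B}/2}, \]

where c_1 and c_2 are constants. Since N ≤ B + B^2 for N ∈ A_3, estimate (4.3.8) then gives S_3 = o(1/n).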
Similarly, by using (4.3.8), we obtain

\[ S_4 \le \frac{1}{\log n} \sum_{N \in A_4} \frac{B^N e^{-B}}{N!} \le \frac{1}{\log n} \sum_{N > B + B^2} \left( \frac{Be}{N} \right)^{N} e^{-B}. \]

Hence, S_4 = o(1/n) because Be/N < e/B < e^{-1} for N ∈ A_4 and n sufficiently large.
If we combine the estimates of S_1, S_2, S_3, and S_4, we obtain

\[ S = \sqrt{\frac{2}{\pi e}}\, \frac{1}{n}\,(1 + o(1)). \]

Substituting this expression into (4.3.4) and expanding n! by the Stirling formula
give the assertion of the theorem. □
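
The result of this proof can be checked numerically. The sketch below is an illustration, not part of the original argument: it assumes that the theorem being proved is the classical asymptotic a_{n,E} = n! √(2/(πn)) (1 + o(1)) for even n (equivalently 2(n/e)^n (1 + o(1)) after Stirling's formula), and it uses the classical closed form a_{n,E} = ((n-1)!!)^2 = n! C(n, n/2)/2^n, which is not stated in the text. All function names are hypothetical.

```python
from itertools import permutations
from math import comb, factorial, pi, sqrt


def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple of images of 0..n-1."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return lengths


def count_even_cycle_perms_bruteforce(n):
    """Count permutations of degree n whose cycles all have even length."""
    return sum(
        all(c % 2 == 0 for c in cycle_lengths(p))
        for p in permutations(range(n))
    )


def count_even_cycle_perms_exact(n):
    """Closed form n! * C(n, n/2) / 2^n for even n (assumed identity), 0 for odd n."""
    if n % 2:
        return 0
    return factorial(n) * comb(n, n // 2) // 2 ** n


def ratio_to_asymptotic(n):
    """a_{n,E} divided by n! * sqrt(2 / (pi * n)) for even n."""
    return (comb(n, n // 2) / 2 ** n) / sqrt(2 / (pi * n))
```

For small even n the brute-force count agrees with the closed form, and `ratio_to_asymptotic(n)` approaches 1 at rate of order 1/n.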


The analogous result is valid for the number of permutations for which R is the 
set of odd numbers. 

We turn now to the case where the set R is not as regular as E. Let R(k) be the
number of elements of R that are not greater than k. Set R(0) = 0. In the sequel,
we assume that

\[ \lim_{k \to \infty} R(k)/k = \rho, \qquad 0 < \rho \le 1. \]

In this case, ρ is called the density of R in the set of natural numbers.
We will find the asymptotics of a_{n,R} under the following additional conditions
on the set R.


(1) There exists a positive integer r such that, for any nonnegative integer s, the
set R ∩ {s+1, ..., s+r} cannot be embedded in any integer lattice with a
step not equal to 1.


(2) The generating function F(z) of the set R has a finite number m of poles at
the points z_l = e^{2πil/m}, l = 0, 1, ..., m - 1, on the unit circle |z| = 1; in
other words, it is of the form

\[ F(z) = \sum_{k \in R} z^k = P(z)/(1 - z^m), \tag{4.3.9} \]

where P(z) is a polynomial.
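
Condition (1) can be tested mechanically on finitely many windows. The following sketch is illustrative only: it relies on the elementary fact that a finite set of integers lies in a lattice with step d exactly when d divides all differences of its elements, so the set avoids every lattice with step greater than 1 exactly when the gcd of the differences is 1. The names `lattice_step` and `satisfies_condition_1`, and the cutoff `s_max`, are assumptions of this sketch; for a purely periodic R with period m, checking s = 0, ..., m - 1 covers all windows.

```python
from functools import reduce
from math import gcd


def lattice_step(points):
    """gcd of the differences of the points (0 if there are fewer than two points)."""
    pts = sorted(points)
    return reduce(gcd, (b - a for a, b in zip(pts, pts[1:])), 0)


def satisfies_condition_1(in_R, r, s_max):
    """Check condition (1) on the windows {s+1, ..., s+r} for s = 0, ..., s_max.

    in_R is a predicate deciding membership in R. A window with fewer than
    two elements of R fits into a lattice of any step, so it fails the check.
    """
    for s in range(s_max + 1):
        window = [k for k in range(s + 1, s + r + 1) if in_R(k)]
        if lattice_step(window) != 1:
            return False
    return True
```

For example, for R equal to the integers not divisible by 3, the window {2, 3, 4} meets R in {2, 4}, which lies in a lattice with step 2, so r = 3 is too small, whereas r = 4 passes. The set E of even numbers fails for every r.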


Note that, since the coefficients of the series F(z) take a finite number of values,
by Szegő's theorem (see, for example, [19]), there are only two possibilities for
F(z): Either F(z) has the form (4.3.9), or the set of singular points of F(z) is dense
everywhere on the unit circle, and therefore F(z) cannot be extended outside the
unit circle. We consider here only the first case. In this case, the coefficients of
F(z), with the exception of some initial numbers, form a periodic sequence with
period m, and, therefore, the set R has density ρ = l/m, where l is the number of
ones in the period.


Consider independent identically distributed random variables ξ_1, ..., ξ_N with
distribution

\[ P\{\xi_1 = k\} = \frac{x^k}{k B(x)}, \qquad k \in R, \tag{4.3.10} \]

where

\[ B(x) = \sum_{k \in R} \frac{x^k}{k}, \qquad x = 1 - \frac{1}{n}. \]


Theorem 4.3.2. Suppose that R has the density ρ > 0 and satisfies conditions
(1) and (2), n → ∞, N = ρ log n + o(log n). Then

\[ n P\{\xi_1 + \cdots + \xi_N = k\} = y^{\rho-1} e^{-y} / \Gamma(\rho)\,(1 + o(1)) \]

uniformly in the integers k such that y = k/n lies in any fixed interval of the form
0 < y_0 ≤ y ≤ y_1 < ∞.


With the aid of Theorem 4.3.2 and relation (4.3.9), we prove the following 
assertions. 


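Theorem 4.3.3. Suppose that R has the density ρ > 0 and satisfies conditions
(1) and (2). Then, as n → ∞,

\[ a_{n,R} = (n-1)!\, e^{B_{n,R}} / \Gamma(\rho)\,(1 + o(1)), \tag{4.3.11} \]

where

\[ B_{n,R} = \sum_{k \in R} \frac{1}{k} \left( 1 - \frac{1}{n} \right)^{k}. \]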
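Since \sum_{k=1}^{∞} (1 - 1/n)^k / k = log n, the assertion (4.3.11) can be written in the
form

\[ a_{n,R} = n!\, e^{-B_{n,\bar R}} / \Gamma(\rho)\,(1 + o(1)), \]

where

\[ B_{n,\bar R} = \sum_{k \notin R} \frac{1}{k} \left( 1 - \frac{1}{n} \right)^{k}. \]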

Theorem 4.3.4. Suppose that R has the density ρ > 0 and satisfies conditions
(1) and (2). Then, as n → ∞,

\[ P\{\nu_{n,R} = N\} = \frac{1}{\sqrt{2\pi B_{n,R}}} \exp\left\{ -\frac{(N - B_{n,R})^2}{2 B_{n,R}} \right\} (1 + o(1)) \]

uniformly in the integers N such that (N - B_{n,R})/√(B_{n,R}) lies in any fixed finite
interval.
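
The asymptotic formula (4.3.11) can be compared with exact counts. The sketch below is a numerical illustration under stated assumptions, not the method of the text: it computes b_n = a_{n,R}/n! from the standard recurrence n b_n = \sum_{k ∈ R, k ≤ n} b_{n-k} (obtained by conditioning on the length of the cycle containing a fixed element), truncates the series for B_{n,R} at a cutoff chosen here for convenience, and forms the ratio of the exact value to (n-1)! e^{B_{n,R}}/Γ(ρ). The function names and the parameter `cutoff_factor` are hypothetical.

```python
from math import exp, gamma


def normalized_counts(in_R, n_max):
    """Return b[0..n_max] with b[n] = a_{n,R} / n!, using n*b[n] = sum_{k in R, k<=n} b[n-k]."""
    b = [1.0] + [0.0] * n_max
    lengths = [k for k in range(1, n_max + 1) if in_R(k)]
    for n in range(1, n_max + 1):
        total = 0.0
        for k in lengths:
            if k > n:
                break
            total += b[n - k]
        b[n] = total / n
    return b


def B_nR(in_R, n, cutoff_factor=60):
    """Truncated series B_{n,R} = sum_{k in R} (1 - 1/n)^k / k (tail is negligible)."""
    x = 1.0 - 1.0 / n
    total, power = 0.0, 1.0
    for k in range(1, cutoff_factor * n + 1):
        power *= x
        if in_R(k):
            total += power / k
    return total


def ratio_to_theorem(in_R, rho, n):
    """Exact a_{n,R}/n! divided by the prediction e^{B_{n,R}} / (n * Gamma(rho))."""
    exact = normalized_counts(in_R, n)[n]
    predicted = exp(B_nR(in_R, n)) / (n * gamma(rho))
    return exact / predicted
```

For R equal to the integers not divisible by 3 (density ρ = 2/3, generating function (z + z^2)/(1 - z^3)), the ratio is already close to 1 for n of a few hundred.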


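To prove Theorem 4.3.2, we establish some auxiliary results.
The characteristic function of distribution (4.3.10) is

\[ \varphi(t) = \sum_{k \in R} \frac{x^k e^{itk}}{k B(x)} = \frac{B(xe^{it})}{B(x)}. \]


Lemma 4.3.1. If R has the density ρ > 0, then, as n → ∞,

\[ \varphi\!\left( \frac{t}{n} \right) = 1 - \frac{\log(1 - it)}{\log n} + o\!\left( \frac{1}{\log n} \right) \]

for any fixed t.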

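Proof. We first derive some auxiliary estimates. It is easy to see that

\[ \sum_{k \in R} x^k = \sum_{k=1}^{\infty} x^k (R(k) - R(k-1)) = \sum_{k=1}^{\infty} x^k R(k) - \sum_{k=1}^{\infty} x^{k+1} R(k) = (1 - x) \sum_{k=1}^{\infty} x^k R(k). \]

Set ε = log n. For such ε,

\[ \sum_{1 \le k \le \varepsilon} x^k R(k) \le \sum_{1 \le k \le \varepsilon} k \le \log^2 n, \]

and, since R has positive density,

\[ \sum_{k > \varepsilon} x^k R(k) = \sum_{k > \varepsilon} k x^k\, \frac{R(k)}{k} = \rho \sum_{k > \varepsilon} k x^k\,(1 + o(1)). \]

Thus, as n → ∞,

\[
\begin{aligned}
\sum_{k \in R} x^k &= \rho (1 - x) \sum_{k=1}^{\infty} k x^k\,(1 + o(1)) + O\big( (1 - x) \log^2 n \big) \\
&= \rho\, \frac{x}{1 - x}\,(1 + o(1)) + O\!\left( \frac{\log^2 n}{n} \right) \\
&= \rho n + o(n).
\end{aligned}
\tag{4.3.12}
\]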
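Similarly we obtain the estimate

\[ B(x) = \sum_{k \in R} \frac{x^k}{k} = \rho \log n + o(\log n). \tag{4.3.13} \]

We now write the characteristic function in the form

\[ \varphi\!\left( \frac{t}{n} \right) = \frac{B(xe^{it/n})}{B(x)} = 1 + \frac{B(xe^{it/n}) - B(x)}{B(x)}. \]

It is easy to see that

\[
\begin{aligned}
B(xe^{it/n}) - B(x) &= \sum_{k \in R} \frac{1}{k}\, x^k (e^{itk/n} - 1) \\
&= \sum_{k=1}^{\infty} \frac{1}{k}\, x^k (e^{itk/n} - 1)(R(k) - R(k-1)) \\
&= \sum_{k=1}^{\infty} R(k) \left( \frac{1}{k}\, x^k (e^{itk/n} - 1) - \frac{1}{k+1}\, x^{k+1} (e^{it(k+1)/n} - 1) \right) \\
&= \sum_{k=1}^{\infty} x^k R(k) \left( \frac{1}{k}\, e^{itk/n} (1 - e^{it/n}) + \frac{1}{kn} (e^{it(k+1)/n} - 1) \right. \\
&\qquad\qquad \left. {} - \frac{1}{k(k+1)n} (e^{it(k+1)/n} - 1) + \frac{1}{k(k+1)} (e^{it(k+1)/n} - 1) \right).
\end{aligned}
\]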


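First of all, we estimate the part that does not contribute essentially to the sum.
If t is fixed and n → ∞, then

\[ \sum_{k=1}^{\infty} \frac{x^k R(k)}{k(k+1)n}\, (e^{it(k+1)/n} - 1) = O\!\left( \frac{1}{n^2} \sum_{k=1}^{\infty} x^k \right) = O\!\left( \frac{1}{n} \right). \]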

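We transform the other parts of the sum as follows:

\[
\begin{aligned}
\sum_{k=1}^{\infty} \frac{x^k R(k)}{k}\, e^{itk/n} (1 - e^{it/n})
&= -it \sum_{k=1}^{\infty} \frac{x^k R(k)}{kn}\, e^{itk/n} + \sum_{k=1}^{\infty} \frac{x^k R(k)}{k}\, e^{itk/n} \left( 1 + \frac{it}{n} - e^{it/n} \right) \\
&= -it \sum_{k=1}^{\infty} \frac{x^k R(k)}{kn}\, e^{itk/n} + O\!\left( \frac{1}{n^2} \sum_{k=1}^{\infty} x^k \right) \\
&= -it \sum_{k=1}^{\infty} \frac{x^k R(k)}{kn}\, e^{itk/n} + O\!\left( \frac{1}{n} \right)
\end{aligned}
\tag{4.3.14}
\]

and

\[
\begin{aligned}
\sum_{k=1}^{\infty} \frac{x^k R(k)}{kn}\, (e^{it(k+1)/n} - 1)
&= \sum_{k=1}^{\infty} \frac{x^k R(k)}{kn}\, (e^{itk/n} - 1) + \sum_{k=1}^{\infty} \frac{x^k R(k)}{kn}\, e^{itk/n} (e^{it/n} - 1) \\
&= \sum_{k=1}^{\infty} \frac{x^k R(k)}{kn}\, (e^{itk/n} - 1) + O\!\left( \frac{1}{n} \right).
\end{aligned}
\tag{4.3.15}
\]

Similarly,

\[ \sum_{k=1}^{\infty} \frac{x^k R(k)}{k(k+1)}\, (e^{it(k+1)/n} - 1) = \sum_{k=1}^{\infty} \frac{x^k R(k)}{k^2}\, (e^{itk/n} - 1) + O\!\left( \frac{\log n}{n} \right). \]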


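Set ε = log n and E = n log n. Then

\[
\begin{aligned}
\left| \sum_{k < \varepsilon} \frac{x^k R(k)}{kn}\, e^{itk/n} \right| &\le \frac{1}{n} \sum_{k < \varepsilon} x^k = O\!\left( \frac{\log n}{n} \right), \\
\left| \sum_{k < \varepsilon} \frac{x^k R(k)}{kn}\, (e^{itk/n} - 1) \right| &\le \frac{2}{n} \sum_{k < \varepsilon} x^k = O\!\left( \frac{\log n}{n} \right), \\
\left| \sum_{k < \varepsilon} \frac{x^k R(k)}{k^2}\, (e^{itk/n} - 1) \right| &\le \frac{|t|}{n} \sum_{k < \varepsilon} x^k = O\!\left( \frac{\log n}{n} \right).
\end{aligned}
\]

In exactly the same way, since x^E ≤ 1/n,

\[
\begin{aligned}
\left| \sum_{k > E} \frac{x^k R(k)}{kn}\, e^{itk/n} \right| &\le \frac{1}{n} \sum_{k > E} x^k = O\!\left( \frac{1}{n} \right), \\
\left| \sum_{k > E} \frac{x^k R(k)}{kn}\, (e^{itk/n} - 1) \right| &\le \frac{2}{n} \sum_{k > E} x^k = O\!\left( \frac{1}{n} \right), \\
\left| \sum_{k > E} \frac{x^k R(k)}{k^2}\, (e^{itk/n} - 1) \right| &\le \frac{2}{E} \sum_{k > E} x^k = O\!\left( \frac{1}{n} \right).
\end{aligned}
\]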

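It is clear that

\[ R(k)/k = \rho + o(1), \qquad x^k = e^{-k/n} (1 + o(1)) \]

uniformly in k, ε ≤ k ≤ E. Hence,

\[
\begin{aligned}
\sum_{\varepsilon \le k \le E} \frac{x^k R(k)}{kn}\, e^{itk/n}
&= \rho \sum_{\varepsilon \le k \le E} \frac{1}{n}\, e^{-k(1 - it)/n} + o\!\left( \sum_{\varepsilon \le k \le E} \frac{1}{n}\, e^{-k/n} \right) \\
&= \rho \sum_{\varepsilon \le k \le E} \frac{1}{n}\, e^{-k(1 - it)/n} + o(1).
\end{aligned}
\]

Similarly,

\[ \sum_{\varepsilon \le k \le E} \frac{x^k R(k)}{kn}\, (e^{itk/n} - 1) = \rho \sum_{\varepsilon \le k \le E} \frac{1}{n}\, e^{-k/n} (e^{itk/n} - 1) + o(1), \]

\[ \sum_{\varepsilon \le k \le E} \frac{x^k R(k)}{k^2}\, (e^{itk/n} - 1) = \rho \sum_{\varepsilon \le k \le E} \frac{1}{n} \cdot \frac{n}{k}\, e^{-k/n} (e^{itk/n} - 1) + o(1). \]

The sums in the right-hand sides of these relations are integral sums of integrable
functions. Therefore, as n → ∞, their limits exist and equal

\[ \int_0^{\infty} e^{-(1 - it)z}\, dz = \frac{1}{1 - it}, \]

\[ \int_0^{\infty} e^{-z} (e^{itz} - 1)\, dz = \frac{it}{1 - it}, \]

\[ \int_0^{\infty} \frac{1}{z}\, e^{-z} (e^{itz} - 1)\, dz = -\log(1 - it), \]

respectively. Thus, as n → ∞, for any fixed t,

\[ B(xe^{it/n}) - B(x) = -\rho \log(1 - it) + o(1), \]

and hence,

\[ \varphi\!\left( \frac{t}{n} \right) = 1 - \frac{\log(1 - it)}{\log n} + o\!\left( \frac{1}{\log n} \right). \]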
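
The key limit in this proof, B(xe^{it/n}) - B(x) → -ρ log(1 - it), is easy to observe numerically. The sketch below is a hedged illustration: it evaluates a truncated version of the series B(z) = \sum_{k ∈ R} z^k/k directly, with x = 1 - 1/n; the truncation point (`cutoff_factor` times n) and the function names are choices made here, not taken from the text.

```python
import cmath


def B(z, in_R, n, cutoff_factor=40):
    """Truncated series B(z) = sum_{k in R} z^k / k, intended for |z| = 1 - 1/n."""
    total, power = 0j, 1 + 0j
    for k in range(1, cutoff_factor * n + 1):
        power *= z
        if in_R(k):
            total += power / k
    return total


def increment(t, in_R, n):
    """B(x e^{it/n}) - B(x) with x = 1 - 1/n."""
    x = 1.0 - 1.0 / n
    return B(x * cmath.exp(1j * t / n), in_R, n) - B(x, in_R, n)


def limit_value(t, rho):
    """The claimed limit -rho * log(1 - it), principal branch."""
    return -rho * cmath.log(1 - 1j * t)
```

With R the set of all natural numbers (ρ = 1) or the integers not divisible by 3 (ρ = 2/3), `increment(t, in_R, n)` differs from `limit_value(t, rho)` by a quantity of order 1/n for fixed t.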


Lemma 4.3.1 implies that for any fixed t, as n → ∞ and N = ρ log n + o(log n),

\[ \varphi^N\!\left( \frac{t}{n} \right) \to (1 - it)^{-\rho}, \tag{4.3.16} \]

and for the normalized sum (ξ_1 + ... + ξ_N)/n the limit distribution is the distribution
with the characteristic function (1 - it)^{-ρ} that has the density y^{ρ-1} e^{-y}/Γ(ρ).

To prove the local convergence of the distributions, we have to estimate φ(t/n)
outside a neighborhood of zero.
Lemma 4.3.2. Suppose that R has the density ρ > 0 and satisfies conditions (1)
and (2). Then, for any ε > 0, there exists q < 1 such that for ε ≤ |t| ≤ π,

\[ |\varphi(t)| \le q. \]


Proof. Let k_1, k_2, and k_3 be integers and a_{k_1}, a_{k_2}, a_{k_3} > 0. It is easy to verify
that

\[
\begin{aligned}
|a_{k_1} e^{itk_1} + a_{k_2} e^{itk_2} + a_{k_3} e^{itk_3}|^2 = {} & (a_{k_1} + a_{k_2} + a_{k_3})^2 \\
& - 2 a_{k_1} a_{k_2} (1 - \cos t(k_2 - k_1)) \\
& - 2 a_{k_1} a_{k_3} (1 - \cos t(k_3 - k_1)) \\
& - 2 a_{k_2} a_{k_3} (1 - \cos t(k_3 - k_2)).
\end{aligned}
\]

For a > 0 and δ > 0,

\[ a - \sqrt{a^2 - \delta} \ge \delta/(2a), \]

since a - √(a² - δ) = δ/(a + √(a² - δ)). Therefore

\[
\begin{aligned}
a_{k_1} + a_{k_2} + a_{k_3} - |a_{k_1} e^{itk_1} + a_{k_2} e^{itk_2} + a_{k_3} e^{itk_3}|
\ge {} & \frac{a_{k_1} a_{k_2} (1 - \cos t(k_2 - k_1))}{a_{k_1} + a_{k_2} + a_{k_3}} \\
& + \frac{a_{k_1} a_{k_3} (1 - \cos t(k_3 - k_1))}{a_{k_1} + a_{k_2} + a_{k_3}} \\
& + \frac{a_{k_2} a_{k_3} (1 - \cos t(k_3 - k_2))}{a_{k_1} + a_{k_2} + a_{k_3}}.
\end{aligned}
\tag{4.3.17}
\]

Suppose now that, as in condition (1), the integers k_1, k_2, and k_3 do not lie on
any lattice with a step greater than 1 and are contained in an interval of length r.
Then, for ε ≤ |t| ≤ π, the three cosines from the right-hand side of (4.3.17) do
not simultaneously take the value 1. Moreover, since k_1, k_2, and k_3 are contained
in an interval of length r, their differences can take only a finite number of values.
Therefore, there exists α > 0 such that for ε ≤ |t| ≤ π,

\[ (1 - \cos t(k_2 - k_1)) + (1 - \cos t(k_3 - k_1)) + (1 - \cos t(k_3 - k_2)) \ge 3\alpha \tag{4.3.18} \]

uniformly in all such k_1, k_2, and k_3.
We now let a_k = x^k/k, k = 1, 2, ..., and suppose condition (1) holds for
k_1 > k_2 > k_3. It follows from (4.3.17) and (4.3.18) that


Qk, + ky + ak, — a," + aj,e'® + ag, et" | > Wad, . (4.3.19) 
Write the characteristic function φ(t) in the form

\[ \varphi(t) = \sum_{l=0}^{\infty} \sum_{k=rl+1}^{rl+r} \frac{x^k}{k B(x)}\, (R(k) - R(k-1))\, e^{itk}. \]

From every set {rl + 1, ..., rl + r}, select, according to condition (1), three integers
k_{1l}, k_{2l}, and k_{3l} from R that do not lie on any integer lattice with a step not
equal to 1. Using estimate (4.3.19) gives


Ay + Aky + Oks, — laeye’! + ape" + ape" | > wa,agys > 0a, Arl4r. 


Therefore, taking into account that R(k_{il}) - R(k_{il} - 1) = 1 for i = 1, 2, 3 and
l = 0, 1, ... yields


co ori+r 
B(x)l@@®)| < > > ae( RK) — RK - 1)) 
1=0 k=rl+1 
foe) 
> Yay + Ak, + Oks) 
1=0 


fo.) 
rr he 
+O lake! + ape + apye!| 
1=0 
xr d+) 


eel 


(4.3.20) 


Inequalities (4.3.20) imply the assertion of Lemma 4.3.2 because r is fixed, x =
1 - 1/n, and, as n → ∞,

\[ B(x) = \rho \log n + o(\log n), \qquad \sum_{l=0}^{\infty} \frac{x^{r(l+1)}}{l+1} = -\log(1 - x^r) = \log n + o(\log n). \]


Lemma 4.3.3. Suppose that R has the density ρ > 0 and satisfies conditions (1)
and (2). Then there exist c_1 and ε > 0 such that for every l = 0, 1, ..., m - 1
and |t/n - 2πl/m| ≤ ε, for sufficiently large n,

\[ \frac{1}{n} \left| \varphi'\!\left( \frac{t}{n} \right) \right| \le \frac{c_1}{\sqrt{1 + (t - 2\pi l n/m)^2}\, \log n}. \]


Proof. We start by estimating

\[ \varphi'\!\left( \frac{t}{n} \right) = \frac{i}{B(x)} \sum_{k \in R} x^k e^{itk/n}. \]

By condition (2), there exist c, δ > 0 such that for |z| < 1, |z_l - z| ≤ δ, l =
0, 1, ..., m - 1,

\[ \left| \sum_{k \in R} z^k \right| \le \frac{c}{|z_l - z|}, \qquad l = 0, 1, \ldots, m - 1. \tag{4.3.21} \]

Set z = xe^{it/n}, where x = 1 - 1/n. It is clear that |z| < 1 and there exists
ε > 0 such that if |t/n - 2πl/m| ≤ ε, then |z_l - z| ≤ δ for sufficiently large n.
Therefore, (4.3.21) implies that for |t/n - 2πl/m| ≤ ε, l = 0, 1, ..., m - 1,

\[
\left| \varphi'\!\left( \frac{t}{n} \right) \right| = \frac{1}{B(x)} \left| \sum_{k \in R} x^k e^{itk/n} \right|
\le \frac{c}{B(x)\, |1 - x e^{i(t/n - 2\pi l/m)}|}
\le \frac{c_2 n}{B(x) \sqrt{1 + (t - 2\pi l n/m)^2}}.
\]

Since B(x) = ρ log n + o(log n), there exists c_1 such that for every l = 0, 1, ...,
m - 1, if |t/n - 2πl/m| ≤ ε, then

\[ \left| \varphi'\!\left( \frac{t}{n} \right) \right| \le \frac{c_1 n}{\log n\, \sqrt{1 + (t - 2\pi l n/m)^2}} \]

for sufficiently large n. □


We now proceed to estimate the characteristic function φ(t) in the intermediate
range of t. Obtaining the estimate involves some technical difficulties. So, for the
sake of greater clarity, we first treat the case R = N. In this case,

\[ p_k = P\{\xi_1 = k\} = \frac{x^k}{k B(x)}, \qquad k = 1, 2, \ldots, \]

\[ B(x) = -\log(1 - x) = \log n. \]

Consider the random variable ξ̃ = ξ_1 - ξ_2. Its distribution is symmetric, and for
m ≥ 0,

\[ P_m = P\{\tilde\xi = m\} = \sum_{k=1}^{\infty} p_k p_{k+m}. \]

Let

\[ \tilde\varphi(t) = \sum_{m=-\infty}^{\infty} P_{|m|}\, e^{itm} = P_0 + 2 \sum_{m=1}^{\infty} P_m \cos tm. \]

It is clear that the characteristic function φ(t) of the random variable ξ_1 is related to
φ̃(t) by the equality φ̃(t) = |φ(t)|². To estimate φ̃(t), we use a standard inequality
(see, e.g., [49]): For t > 0,

\[ 1 - \tilde\varphi(t) = 2 \sum_{m=1}^{\infty} P_m (1 - \cos tm) \ge 2 \sum_{s=0}^{\infty} \sum_{m \in M_s} P_m, \tag{4.3.22} \]

where

\[ M_s = \left\{ m:\ \frac{\pi}{2t} + \frac{2\pi s}{t} \le m \le \frac{3\pi}{2t} + \frac{2\pi s}{t} \right\}. \]


Lemma 4.3.4. For mo > 0, 


CO 
ie a aw ee, 
m>mo 1>2mgok=1 
Proof. By using l = m + k as the variable of summation, we obtain


I—mo 


Rey y vee > aes x Pk PI. 


m>mo m2>mo k=1 1>mo+1 k=1 l=1 k=l+mo 


(4.3.23) 


The right-hand side of (4.3.23) is estimated from below by the quantity 


3 ae he > pr. 


1>2m9-—k=1 1>2mo 


To see this, it is sufficient to delete the first terms in the first sum from (4.3.23), 
retaining 


l—mo 


3 Pl ye Pk» 


1>2mo k=1 
and, in the second sum from (4.3.23), to shift the domain of summation to 2mo, 
giving 
fo. @) 
Ya Ym 
1>2mq — k=I—mo+1 


which does not exceed the second sum from (4.3.23) by the monotonicity of the
probabilities. □


Lemma 4.3.5. For 0 < t ≤ π,

\[ 1 - \tilde\varphi(t) \ge \frac{1}{3} \sum_{k > \pi/t} p_k. \]


Proof. Note that the summation on the right-hand side of (4.3.22) occurs over
integers m from intervals of length π/t. If we enumerate intervals of such a length
on the positive semi-axis starting at the point π/(2t), the domain of summation
will consist of the intervals labeled by odd numbers. Notice that the sequence of
probabilities p_k, k = 1, 2, ..., is monotone, and the numbers of integer points
in any two intervals of length π/t differ by at most 1. Therefore each interval of
length π/t for 0 < t ≤ π contained in the right-hand side of the sum (4.3.22)
contributes not less than one-third of the total sum of the two following intervals:
the interval itself and the interval adjoined to it on the right side, which does not
belong to the initial domain of summation. (Note that, as t → 0, the number
of integer points in one interval increases and its contribution to the sum tends to
1/2.) Therefore, (4.3.22) implies

\[ 1 - \tilde\varphi(t) \ge 2 \sum_{s=0}^{\infty} \sum_{m \in M_s} P_m \ge \frac{2}{3} \sum_{m > \pi/(2t)} P_m. \]

By applying Lemma 4.3.4, we obtain the assertion of Lemma 4.3.5. □


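It remains to estimate the sum of the form \sum_{k ≥ a} p_k from below. If we use the
inequality

\[ 1 - 1/n \ge e^{-1/(n-1)}, \]

we obtain

\[ \sum_{k \ge a} \frac{x^k}{k} \ge \sum_{k \ge a} \frac{e^{-k/(n-1)}}{k/(n-1)} \cdot \frac{1}{n-1} \ge \int_{a/(n-1)}^{\infty} \frac{e^{-y}}{y}\, dy \ge c_3 - \log \frac{a}{n}, \tag{4.3.24} \]

where c_3 is a constant.
We use Lemma 4.3.5, set a = πn/|t| in (4.3.24), and obtain for |t|/n ≤ π,

\[ 1 - \tilde\varphi\!\left( \frac{t}{n} \right) \ge \frac{1}{3} \sum_{l \ge \pi n/|t|} \frac{x^l}{l B(x)} \ge \frac{1}{3B(x)} \left( c_3 - \log \frac{\pi}{|t|} \right) = \frac{1}{3B(x)} (\log |t| + c_4), \]

where c_4 is a constant. Hence, we go on to estimate φ(t/n) and find that

\[ \left| \varphi\!\left( \frac{t}{n} \right) \right| \le \left( 1 - \frac{\log |t| + c_4}{3 \log n} \right)^{1/2} \le 1 - \frac{\log |t| + c_4}{6 \log n}. \]

If N ≥ (1/2) log n, then

\[ \left| \varphi\!\left( \frac{t}{n} \right) \right|^N \le \exp\left\{ -\frac{1}{12} (\log |t| + c_4) \right\} = c_5 |t|^{-1/12}. \tag{4.3.25} \]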
We now return to the case R ⊂ N. We retain the notation φ(t) and φ̃(t) for the
characteristic functions and set

\[ p_k = \frac{a_k x^k \delta_R(k)}{B(x)}, \qquad B(x) = \sum_{k=1}^{\infty} a_k x^k \delta_R(k), \qquad a_k = \frac{1}{k}, \]

where δ_R(k) = 0 for k ∉ R, and δ_R(k) = 1 for k ∈ R.


Lemma 4.3.6. Suppose that R has the density ρ > 0 and satisfies conditions (1)
and (2). Then, for |t|/n ≤ π and N ≥ (1/2) ρ log n,

\[ \left| \varphi\!\left( \frac{t}{n} \right) \right|^N \le c_6 |t|^{-1/(12 r^2 \rho)}, \]

where r is defined in condition (1) and c_6 is a constant.


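Proof. We revise the arguments leading to estimate (4.3.25). Inequality (4.3.22)
now takes the following form: For t > 0,

\[ 1 - |\varphi(t)|^2 \ge \frac{2}{B^2(x)} \sum_{s=0}^{\infty} \sum_{m \in M_s} \sum_{k=1}^{\infty} a_k x^k \delta_R(k)\, a_{k+m} x^{k+m} \delta_R(k+m), \]

where

\[ M_s = \left\{ m:\ \frac{\pi}{2t} + \frac{2\pi s}{t} \le m \le \frac{3\pi}{2t} + \frac{2\pi s}{t} \right\}. \]
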
We retain only one summand in each interval of length r, replace this summand by 
the minimum value over the interval, and use the transition from the sum over one 
interval of length r to one-third of the sum over the interval of twice the length. 
Then we obtain for ¢ > 0, 


lo.@) 

1 
So YS ae x*t" Sp (k +m) > ; > Gegext", 
s=0 meM, rl>n/(2t) 


Once again, we preserve only one summand in each interval of length r and get 
1-lg@? = x* Sak Ogsrix® tlt 
Ip(e)| sie sim RC : yo ake 
>n/(2tr) 


foe) 


2 I 
SBT Da trmt™ arma 
m=1 


l>n/(2tr) 


IV 


xrm xr mt) 


- FoR os oe mm+l- 


l>n/(2tr)m=1 


The assertion of Lemma 4.3.4 is based on the monotonicity of the probabilities
p_k, k = 1, 2, .... The summands of the last double sum are similar to the
summands of the sum in Lemma 4.3.4, and the values x^{rk}/k, k = 1, 2, ..., are
also monotonic. Therefore we may use Lemma 4.3.4 and obtain


rl 


2 xtl SO yrm log(1 — x”) x 
beWOl Spare so ae oe ee 
3B (x)r l>n/(2tr) l m=1 a 3B (x)r l>x/(tr) 


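For a fixed r, the estimate (4.3.24) remains true. Therefore, by taking into account
the asymptotics B(x) = ρ log n + o(log n) and -log(1 - x^r) = log n + o(log n),
we find

\[ 1 - \left| \varphi\!\left( \frac{t}{n} \right) \right|^2 \ge \frac{1}{3 r^2 \rho^2 \log n}\, (\log |t| + c). \]

Hence,

\[ \left| \varphi\!\left( \frac{t}{n} \right) \right| \le \left( 1 - \frac{\log |t| + c}{3 r^2 \rho^2 \log n} \right)^{1/2} \le \exp\left\{ -\frac{\log |t| + c}{6 r^2 \rho^2 \log n} \right\}, \]

and for N ≥ (1/2) ρ log n,

\[ \left| \varphi\!\left( \frac{t}{n} \right) \right|^N \le c_6 |t|^{-1/(12 r^2 \rho)}, \]

where c_6 is a constant. □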


Proof of Theorem 4.3.2. Consider the sum ζ_N = ξ_1 + ... + ξ_N of independent
identically distributed random variables with distribution (4.3.10). As we have
seen, Lemma 4.3.1 implies that, as n → ∞ and N = ρ log n + o(log n), the
distribution of ζ_N/n converges weakly to the distribution with density

\[ u^{\rho-1} e^{-u} / \Gamma(\rho), \qquad u > 0. \]

We now prove the local convergence of these distributions. For an integer k, let
y = k/n. By the inversion formula,

\[ P\{\zeta_N = k\} = P\left\{ \frac{1}{n}\, \zeta_N = y \right\} = \frac{1}{2\pi n} \int_{-\pi n}^{\pi n} e^{-ity} \varphi^N\!\left( \frac{t}{n} \right) dt, \]

where φ(t) is the characteristic function of the distribution (4.3.10). The density
of the limit distribution at a point u > 0 can be represented by the integral

\[ \frac{u^{\rho-1} e^{-u}}{\Gamma(\rho)} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-itu}}{(1 - it)^{\rho}}\, dt. \]

Hence,

\[ 2\pi n P\{\zeta_N = k\} - 2\pi y^{\rho-1} e^{-y} / \Gamma(\rho) = I_1 + I_2 + I_3, \]

where

\[ I_1 = \int_{-A}^{A} e^{-ity} \left( \varphi^N\!\left( \frac{t}{n} \right) - \frac{1}{(1 - it)^{\rho}} \right) dt, \]

\[ I_2 = -\int_{A \le |t|} e^{-ity}\, \frac{1}{(1 - it)^{\rho}}\, dt, \]

\[ I_3 = \int_{A \le |t| \le \pi n} e^{-ity} \varphi^N\!\left( \frac{t}{n} \right) dt, \]

and the constant A in the integrals is to be chosen later.

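By (4.3.16), for any fixed A, the integral I_1 tends to zero as n → ∞ and
N = ρ log n + o(log n).

To estimate the integrals I_2 and I_3, we integrate by parts. For I_2, this yields

\[ \int_A^{\infty} \frac{e^{-ity}}{(1 - it)^{\rho}}\, dt = \left. -\frac{e^{-ity}}{iy (1 - it)^{\rho}} \right|_A^{\infty} + \frac{\rho}{y} \int_A^{\infty} \frac{e^{-ity}}{(1 - it)^{\rho+1}}\, dt. \]

Hence,

\[ |I_2| \le \frac{2}{y (1 + A^2)^{\rho/2}} + \frac{2\rho}{y} \int_A^{\infty} \frac{dt}{(1 + t^2)^{(\rho+1)/2}}, \]

and |I_2| can be made arbitrarily small by the choice of A.

Similarly,

\[ \int_A^{\pi n} e^{-ity} \varphi^N\!\left( \frac{t}{n} \right) dt = \left. -\frac{e^{-ity}}{iy}\, \varphi^N\!\left( \frac{t}{n} \right) \right|_A^{\pi n} + \frac{N}{iy} \int_A^{\pi n} e^{-ity} \varphi^{N-1}\!\left( \frac{t}{n} \right) \frac{1}{n}\, \varphi'\!\left( \frac{t}{n} \right) dt. \]

Treating the integral over [-πn, -A] in the same way, we obtain

\[ |I_3| \le \frac{2}{y}\, |\varphi(\pi)|^N + \frac{2}{y} \left| \varphi\!\left( \frac{A}{n} \right) \right|^N + |I|, \]

where

\[ I = \frac{N}{iy} \int_{A \le |t| \le \pi n} e^{-ity} \varphi^{N-1}\!\left( \frac{t}{n} \right) \frac{1}{n}\, \varphi'\!\left( \frac{t}{n} \right) dt. \]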


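When we use the estimates of Lemmas 4.3.2 and 4.3.6, we obtain

\[ |\varphi(\pi)|^N \le q^N, \quad q < 1; \qquad |\varphi(A/n)|^N \le c_6 A^{-1/(12 r^2 \rho)}. \]

Hence these summands can be made arbitrarily small. It remains to estimate the
integral I. Choose ε such that Lemma 4.3.3 is valid, and represent I as the sum of
three integrals:

\[ I = I_1(\varepsilon) + I_2(\varepsilon) + I_3(\varepsilon), \]

where

\[ I_1(\varepsilon) = \frac{N}{iy} \int_{A \le |t| \le \varepsilon n} e^{-ity} \varphi^{N-1}\!\left( \frac{t}{n} \right) \frac{1}{n}\, \varphi'\!\left( \frac{t}{n} \right) dt, \]

I_2(ε) is the integral over the sum of ε-neighborhoods of the poles of F(z), that is,
over the set

\[ \bigcup_{l=1}^{m-1} \left[ -\varepsilon n + \frac{2\pi l n}{m},\ \varepsilon n + \frac{2\pi l n}{m} \right], \]

and I_3(ε) is the integral over the remaining set

\[ A_\varepsilon = [-\pi n, -\varepsilon n] \cup [\varepsilon n, \pi n] \setminus \bigcup_{l=1}^{m-1} \left[ -\varepsilon n + \frac{2\pi l n}{m},\ \varepsilon n + \frac{2\pi l n}{m} \right]. \]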


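By using Lemmas 4.3.3 and 4.3.6, we find

\[ |I_1(\varepsilon)| \le \frac{2 N c_1 c_6}{y \log n} \int_A^{\infty} \frac{t^{-1/(12 r^2 \rho)}}{(1 + t^2)^{1/2}}\, dt, \]

and for y ≥ y_0 > 0, the value |I_1(ε)| can be made arbitrarily small by the choice
of a sufficiently large A.
By using Lemma 4.3.3, we find

\[ \frac{1}{n} \int_{-\varepsilon n + 2\pi l n/m}^{\varepsilon n + 2\pi l n/m} \left| \varphi'\!\left( \frac{t}{n} \right) \right| dt \le \frac{c_1}{\log n} \int_{-\varepsilon n + 2\pi l n/m}^{\varepsilon n + 2\pi l n/m} \frac{dt}{\sqrt{1 + (t - 2\pi l n/m)^2}} = \frac{c_1}{\log n} \int_{-\varepsilon n}^{\varepsilon n} \frac{dt}{\sqrt{1 + t^2}}, \]

and there exists a constant c_7 such that for a fixed ε,

\[ \int_{-\varepsilon n}^{\varepsilon n} \frac{dt}{\sqrt{1 + t^2}} \le c_7 \log n. \]

Therefore, we use the estimate of Lemma 4.3.2 and find that for y ≥ y_0 > 0,

\[ |I_2(\varepsilon)| \le \frac{m c_1 N q^{N-1}}{y \log n} \int_{-\varepsilon n}^{\varepsilon n} \frac{dt}{\sqrt{1 + t^2}} \le m c_1 c_7 y_0^{-1} N q^{N-1}, \]

and under the conditions of Theorem 4.3.2, the right-hand side tends to zero.
For t ∈ A_ε,

\[ |\varphi'(t/n)| \le c_8 / B(x), \]

where c_8 is a constant that is the upper bound of |F(z)| for |z| = x not in the
neighborhoods of the poles. By using this estimate and the estimate of Lemma
4.3.2, we find

\[
|I_3(\varepsilon)| \le \frac{N}{y} \int_{A_\varepsilon} \left| \varphi\!\left( \frac{t}{n} \right) \right|^{N-1} \frac{1}{n} \left| \varphi'\!\left( \frac{t}{n} \right) \right| dt
\le \frac{c_8 N q^{N-1}}{y n B(x)} \int_{A_\varepsilon} dt
\le \frac{2\pi c_8 N q^{N-1}}{y B(x)}.
\]

Under the conditions of the theorem, the last term of this chain of inequalities
tends to zero for y ≥ y_0 > 0. □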


It is easy to see that by first choosing a sufficiently large A and then a sufficiently
large n, we can make the difference being estimated arbitrarily small. Note that
the difference is bounded uniformly with respect to N, and hence, there exists a
constant c_9 such that for y ≥ y_0 > 0 and for all N,

\[ P\{\zeta_N = k\} \le c_9 / n. \tag{4.3.26} \]


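Proof of Theorem 4.3.3. In (4.3.4), divide the domain of summation into two
parts: N_1 = {N: |N - B(x)| ≤ N^{2/3}} and N_2 = {N: |N - B(x)| > N^{2/3}}.

It is not difficult to see that the assertion of Theorem 4.3.2 is fulfilled uniformly
in N ∈ N_1. Therefore

\[ n P\{\zeta_N = n\} = e^{-1} / \Gamma(\rho)\,(1 + o(1)) \]

uniformly in N ∈ N_1, so

\[
\sum_{N \in N_1} \frac{(B(x))^N e^{-B(x)}}{N!}\, P\{\zeta_N = n\}
= \frac{e^{-1}}{n \Gamma(\rho)} \sum_{N \in N_1} \frac{(B(x))^N e^{-B(x)}}{N!}\,(1 + o(1))
= \frac{e^{-1}}{n \Gamma(\rho)}\,(1 + o(1)).
\]

We use the estimate (4.3.26) and obtain

\[ \sum_{N \in N_2} \frac{(B(x))^N e^{-B(x)}}{N!}\, P\{\zeta_N = n\} \le \frac{c_9}{n} \sum_{N \in N_2} \frac{(B(x))^N e^{-B(x)}}{N!}. \]

Since the sum on the right-hand side of this inequality tends to zero, the total sum
in (4.3.4) equals (enΓ(ρ))^{-1}(1 + o(1)). It remains to note that

\[ x^n = e^{-1} (1 + o(1)), \qquad B(x) = B_{n,R} = \sum_{k \in R} \frac{1}{k} \left( 1 - \frac{1}{n} \right)^{k}. \]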
Proof of Theorem 4.3.4. According to (4.3.3),

\[ P\{\nu_{n,R} = N\} = \frac{n!\, (B(x))^N}{N!\, x^n a_{n,R}}\, P\{\zeta_N = n\}. \]

If we substitute the corresponding expressions for a_{n,R} and P{ζ_N = n}, we obtain

\[ P\{\nu_{n,R} = N\} = \frac{(B(x))^N}{N!}\, e^{-B(x)}\,(1 + o(1)) \]

for N = B(x) + o(B(x)). We note that B(x) = B_{n,R} and that the expression
obtained above holds uniformly in N such that (N - B(x))/√(B(x)) lies in any
fixed finite interval; thus, we obtain the assertion of Theorem 4.3.4. □


4.4. Notes and references 


The probabilistic approach that is now commonly used in combinatorics was first
formulated in an explicit form and applied in the investigations of the symmetric
group S_n by V. L. Goncharov [51, 52, 53]. For the random variables α_1, ..., α_n,
he found the joint distribution (4.1.4) and the generating function (4.1.5). For the
total number of cycles ν_n = α_1 + ... + α_n, he proved that, as n → ∞,

\[ E\nu_n = \log n + \gamma + o(1), \]

\[ \sqrt{D\nu_n} = \sqrt{\log n} - (\pi^2/12 - \gamma/2)/\sqrt{\log n} + o(1/\sqrt{\log n}). \]

Goncharov also proved that the distribution of (ν_n - log n)/√(log n) converges
to the standard normal distribution, and the distribution of α_r converges to the
Poisson distribution with parameter 1/r.
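
These expansions are easy to compare with exact values. The sketch below rests on a classical fact that is not stated in the text and is taken here as an assumption: the number of cycles of a uniform random permutation of degree n is distributed as a sum of independent indicators with success probabilities 1/k, k = 1, ..., n, so that Eν_n = \sum 1/k and Dν_n = \sum (1/k - 1/k^2). The function names and the constant `EULER_GAMMA` are introduced only for this illustration.

```python
from math import log, pi, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler's constant, rounded to double precision


def cycle_count_moments(n):
    """Exact mean and variance of the number of cycles (assumed indicator representation)."""
    mean = sum(1.0 / k for k in range(1, n + 1))
    var = sum(1.0 / k - 1.0 / k ** 2 for k in range(1, n + 1))
    return mean, var


def goncharov_mean(n):
    """Leading terms log n + gamma of the expansion of E nu_n."""
    return log(n) + EULER_GAMMA


def goncharov_std(n):
    """Leading terms of the expansion of sqrt(D nu_n)."""
    root = sqrt(log(n))
    return root - (pi ** 2 / 12 - EULER_GAMMA / 2) / root
```

For n = 10^5 the mean agrees with log n + γ to within about 1/(2n), while the standard deviation agrees with the two-term expansion only to a few units in the third decimal place, reflecting the slow o(1/√(log n)) remainder.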

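Let β_n be the length of the maximum cycle in a random permutation from S_n.
Goncharov [51, 53] showed that

\[ P\{\beta_n \le m\} = \sum_{h \ge 0} \frac{(-1)^h}{h!}\, S_h(m, n), \]

where

\[ S_0(m, n) = 1, \qquad S_h(m, n) = \sum_{\substack{k_1 + \cdots + k_h \le n, \\ k_1, \ldots, k_h > m}} \frac{1}{k_1 \cdots k_h}. \]

Let

\[ I_0(x, 1 - x) = 1, \qquad I_h(x, 1 - x) = \int\limits_{\substack{x_1 + \cdots + x_h \le 1 - x, \\ x_1, \ldots, x_h \ge x}} \frac{dx_1 \cdots dx_h}{x_1 \cdots x_h}. \]

Goncharov proved that, as n → ∞, the random variable β_n/n has the limit distribution
with the density

\[ \varphi(x) = \frac{1}{x} \sum_{h \ge 0} \frac{(-1)^h}{h!}\, I_h(x, 1 - x), \qquad \frac{1}{1 + \lambda} \le x \le \frac{1}{\lambda}, \]

which, as is clear from the preceding formula, is defined by different analytic
expressions on the sequential intervals of the form [1/(1 + λ), 1/λ], where λ is an
integer. For example,

\[ \varphi(x) = \frac{1}{x}, \qquad \frac{1}{2} \le x \le 1; \]

\[ \varphi(x) = \frac{1}{x} \left( 1 - \log \frac{1 - x}{x} \right), \qquad \frac{1}{3} \le x \le \frac{1}{2}. \]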

Although Goncharov investigated the cycle structure of random permutations
in great detail, these problems continue to be of significant interest to mathematicians.
V. F. Kolchin [71] proposed an approach based on the generalized scheme
of allocation. The results on the asymptotic properties of random permutations
obtained with the help of this approach are presented in [78]. Note that, among
other results, the asymptotic logarithmic normality of the middle terms of the series
of order statistics composed of the lengths of cycles, and the local limit theorem
on the convergence of the distribution of the total number of cycles ν_n to the normal
distribution were first proved by this method. It is clear that this approach
makes it possible to investigate the asymptotic behavior of the local probabilities
P{ν_n = N} for all possible values of N = N(n) as n → ∞. These investigations
were carried out in [109, 115, 117, 146, 147, 148]. In Section 4.2, the results
of these investigations are presented. Theorems 4.2.1 and 4.2.2 were proved by
Yu. Pavlov in [115, 117]; and Theorems 4.2.5, 4.2.6, 4.2.9, 4.2.10, and 4.2.12 were
proved by L. M. Volynets in [146, 147, 148].

Methods of estimating the rate of convergence in limit theorems for sums of 
independent random variables are well developed in the theory of probability. 
Therefore the approach that reduces the study of characteristics of random per- 
mutations to problems concerning the sums of independent summands provides 
an obvious way to obtain the limit theorems containing estimates of the rate of 
convergence. The estimates under the conditions of Theorem 4.2.1 were obtained 
by Yu. Pavlov [117] and for γ = 1 by A. Pavlov [109]. The following result of
Volynets [146] provides a better bound than the one given in [109]. 


Theorem 4.4.1. If n → ∞, N = log n + x√(log n), x/√(log n) → 0, then


ay Mogn) |x| I 
Volynets [146] proved this theorem by using the approach based on the generalized
scheme of allocation.

Let Σ_n be the set of all single-valued mappings of the set {1, ..., n} into itself.
In particular, S_n ⊂ Σ_n. The random mappings from Σ_n were first studied by
J. B. Kruskal [94] and B. Harris [57], and many studies have considered subsets
of Σ_n, which are distinguished from Σ_n by various constraints on the mappings.
We mention only the articles by V. N. Sachkov [128, 129, 130], in which the
mappings have the height of less than a fixed number, and cycle lengths are from
a fixed set; the articles by A. A. Grusho [54, 55], which treat the subset Σ_{n,r} that
consists of the mappings from Σ_n whose vertex degrees are not greater than r; the
articles by Yu. Pavlov [114, 115] considering the characteristics of the mappings
with exactly m components (the case m = 1 is considered by G. N. Bagaev in
[8, 9]); and the article by J. Arney and E. A. Bender [5], which treats mappings
with constraints on degrees of the vertices. The research in these directions began
in the early seventies and is still ongoing. In our opinion, the most surprising results
concerning mappings with constraints were obtained by I. B. Kalugin [64], which
we summarize.

Let Σ_{n,R} be the subset of mappings from Σ_n such that the degrees of the vertices
take values only from a set R that contains zero and does not coincide with the set
{0, 1}.

Let ξ(λ) be a random variable with the distribution

\[ P\{\xi(\lambda) = k\} = \frac{\lambda^k}{k!\, P(R, \lambda)}, \qquad k \in R, \]

where λ is a positive constant and

\[ P(R, \lambda) = \sum_{k \in R} \frac{\lambda^k}{k!}. \]

There exists λ_R such that Eξ(λ_R) = 1. Denote by B_R the variance of the
random variable ξ(λ_R). For the number of cyclic vertices λ_{n,R} and the height
τ_{n,R} of the random mapping from Σ_{n,R}, the following assertions are well known
[64, 78].


Theorem 4.4.2. If n → ∞, then

\[ \sqrt{n / B_R}\, P\{\lambda_{n,R} = k\} = z e^{-z^2/2} (1 + o(1)) \]

uniformly in the integers k such that z = k√(B_R/n) lies in any interval of the form
0 < z_0 ≤ z ≤ z_1 < ∞.


Theorem 4.4.3. If n → ∞, then for any fixed x > 0,

\[ P\{\sqrt{B_R / n}\, \tau_{n,R} \le x\} \to \sum_{k=-\infty}^{\infty} (-1)^k e^{-k^2 x^2/2}. \]
An unexpected result appears if we consider the set Σ*_{n,R} of mappings from Σ_n
defined as follows. If in the graph of a mapping from Σ_n we delete the edges that
connect the cyclic vertices, we obtain a graph consisting of trees. The set Σ*_{n,R}
contains the mappings from Σ_n such that the degree of any vertex of the trees
takes a value in R. Thus the difference in the restrictions on the degrees in Σ*_{n,R}
and Σ_{n,R} seems to be insignificant because only the restrictions on the degrees of
cyclic vertices differ by 1. But the sets Σ_{n,R} and Σ*_{n,R} have a substantial difference
in the structure of their corresponding random graphs.

Let λ*_{n,R} and τ*_{n,R}, respectively, be the number of cyclic vertices and the height
of a random mapping from the set Σ*_{n,R} with uniform distribution. For the random
variable ξ(λ), set

\[ a_R = E\xi(1), \qquad b_R^2 = D\xi(1). \]

If R does not coincide with the set of all nonnegative integers, then a_R < 1.


Theorem 4.4.4. If n → ∞, then

\[ P\{\lambda^*_{n,R} = k\} = \frac{1}{b_R \sqrt{2\pi n}}\, e^{-z^2/2} (1 + o(1)) \]

uniformly in the integers k such that z = (k - (1 - a_R)n)/(b_R √n) lies in any fixed
finite interval.


Theorem 4.4.5. If n → ∞ and t = t(n) is such that n a_R^t → β, where β is a
constant, then for any fixed integer m,

\[ P\{\tau^*_{n,R} < t + m\} = \exp\{-k_R \beta a_R^m\} + o(1), \]

where the constant k_R depends only on β and the set R.


Since t = t(n) is of order log n, the random mappings from Σ*_{n,R} have many
cyclic vertices and, as a consequence, have the height of order log n rather than
√n as in the case for the mappings from Σ_{n,R}. A satisfactory explanation for this
situation is not known.

In Section 4.3, we considered the set S_{n,R} of all permutations of degree n with
cycle lengths from a fixed set R. The interest in such sets may be partly explained
by their connection with the equations involving permutations, which we will look
at in the next section. Another reason for investigating the set S_{n,R} and similar sets
of mappings with various restrictions is the possibility (see [5]) of approximating
more complicated sets of combinatorial objects by such sets with relatively simple
constraints. Partly for these reasons, the asymptotic behavior of the number a_{n,R}
of elements in S_{n,R} has been considered in some recent studies [25, 80, 102, 149,
153, 154].
The generating function f(z) for the numbers a_{n,R} of elements in S_{n,R} is

\[ f(z) = \sum_{n=0}^{\infty} \frac{a_{n,R}\, z^n}{n!} = \exp\left\{ \sum_{r \in R} \frac{z^r}{r} \right\}. \]

Therefore it is convenient to apply the saddle-point method to obtain the asymptotics
of a_{n,R}. By this method, the cases in which the elements of R form an
arbitrary arithmetic progression are considered in [25, 107]; see also [130].

The application of the Tauberian-type theorems is another approach that has
been used in the investigations of this problem [153, 154, 155]. Let R(n) be the
number of elements of R that are not greater than n and let |A| be the number of
elements in A.


Theorem 4.4.6. Let n → ∞,

\[ R(n)/n \to \rho, \qquad 0 < \rho \le 1, \tag{4.4.1} \]

and for m ≥ n, m = O(n),

\[ \frac{1}{n}\, |\{k:\ k \le n,\ k \in R,\ m - k \in R\}| \to \rho^2. \tag{4.4.2} \]

Then

\[ a_{n,R} = (n - 1)!\, \exp\{l_{n,R} - \gamma\rho\} / \Gamma(\rho)\,(1 + o(1)), \tag{4.4.3} \]

where

\[ l_{n,R} = \sum_{r \in R,\ r \le n} \frac{1}{r}, \]

γ is the Euler constant, and Γ is the Euler gamma function.


Conditions (4.4.1) and (4.4.2) indicate that the set R is similar to a typical 
realization of a random set containing each positive integer with probability p 
independent of the other integers. 

As examples of the sets R that satisfy conditions (4.4.1) and (4.4.2), we may 
take sets of the form 


R= {k: {g(k)} € A}, (4.4.4) 


where g(t) is a real-valued function of t > 0, {x} is the fractional part of x, and
A is an interval or a finite union of intervals from [0, 1] with the Lebesgue measure ρ.

A. L. Yakymiv [154, 155] proved that a set R of the form (4.4.4) satisfies
conditions (4.4.1) and (4.4.2) if

\[ g(t) = t^{\alpha}\, l(t), \]

where α is a noninteger positive number, l(t) is a slowly varying function, and as
t → ∞,

\[ \frac{d^n}{dt^n}\, l(t) = o(t^{-n} l(t)), \qquad n = 1, \ldots, [\alpha] + 2. \]

Let α_{r,R} be the number of cycles of length r in a random permutation of S_{n,R}
and let ν_{n,R} = α_{1,R} + ... + α_{n,R} be its total number of cycles. Yakymiv [154,
155] proved the following assertions.


Theorem 4.4.7. Suppose that conditions (4.4.1) and (4.4.2) are satisfied and
n → ∞. Then the distribution of the random variable (ν_{n,R} - l_{n,R})/√(ρ log n)
converges weakly to the standard normal distribution, and for any fixed r ∈ R, the
distribution of α_{r,R} converges to the Poisson distribution with parameter 1/r.


A case of irregular behavior of a_{n,R} is considered in [149].


Theorem 4.4.8. If n → ∞ and R = E ∪ M, where E is the set of all even positive
numbers and M is a set of odd numbers such that the series

\[ b = \sum_{k \in M} \frac{1}{k} \]

converges, then

\[ a_{n,R} = \left( \frac{n}{e} \right)^{n} (e^{b} + e^{-b})(1 + o(1)) \]

for even n, and

\[ a_{n,R} = \left( \frac{n}{e} \right)^{n} (e^{b} - e^{-b})(1 + o(1)) \]

for odd n.


Volynets [149] proved this theorem with the aid of relation (4.3.4), in which
she uses the representation

\[ P\{\xi_1^{(R)} + \cdots + \xi_N^{(R)} = n\} = \sum_{s, m} P\{\nu = m,\ \eta = s\}\, P\{\xi_1^{(E)} + \cdots + \xi_{N-m}^{(E)} = n - s\}. \]

Here the variables ξ_1^{(R)}, ..., ξ_N^{(R)} have the parameter x equal to √(1 - 1/n), ν is
the number of these variables taking values in M, η is the sum of these variables,
and ξ_1^{(E)}, ξ_2^{(E)}, ... are independent identically distributed random variables with
the distribution

\[ P\{\xi_1^{(E)} = k\} = \frac{2 x^k}{k \log n}, \qquad k \in E. \]

Note that if b → 0, the result of Theorem 4.4.8 transfers continuously to (4.3.5).
Theorems 4.3.2, 4.3.3, and 4.3.4 are given in [80]. It can be easily shown
that the asymptotics (4.3.11) and (4.4.3) are identical. Thus, quite different sets 
of conditions yield coinciding results. This coincidence shows that there exist 
weaker conditions sufficient for the validity of the asymptotics (4.3.11). We give 
the detailed and cumbersome proof of Theorem 4.3.3 because we conjecture that 
condition (1) from this theorem and the existence of a positive density of R are 
sufficient for the validity of (4.3.11) and that it may be possible to simplify the 
proof. 

The research on the sets $S_{n,R}$ of permutations with restrictions on the cycle
lengths provides an example of a fruitful competition of various analytical methods
of asymptotic analysis such as the saddle-point method, the application of
Tauberian-type theorems, and the approach based on the generalized scheme of
allocation.

Note that it would also be interesting to consider the cases where the density
$\rho = 0$.


5 


Equations containing an unknown 
permutation 


5.1. A quadratic equation 
If $g$ and $f$ are permutations of degree $n$, then the result of their sequential action
$h = fg$ is a permutation of degree $n$ called the product of $g$ and $f$. The set $S_n$
of all permutations of degree $n$ with this operation is the well-known symmetric
group of degree $n$. Therefore we can consider equations of the form

$$X^d = a, \tag{5.1.1}$$

where $d$ is a positive integer, $a \in S_n$, and $X$ is an unknown permutation from $S_n$.
In the previous chapter, we considered the set $S_{n,R}$ of all permutations of degree $n$
with cycle lengths from a fixed set $R$ and found the asymptotics for the number of
elements in $S_{n,R}$ for some regular sets $R$. The interest in the sets of permutations
$S_{n,R}$ may be partly explained by their connection with some equations involving
permutations. For example, the set of all solutions of the equation

$$X^p = e \tag{5.1.2}$$

in the symmetric group $S_n$, where $e$ is the identity permutation and $p$ is a prime
number, is exactly the set $S_{n,R}$ with $R = \{1, p\}$. Indeed, a permutation $X$ satisfies
equation (5.1.2) if and only if its cycles are of the length 1 or $p$. Denote by $T_n^{(p)}$
the number of solutions of equation (5.1.2).


Theorem 5.1.1. If $p$ is a prime number, then

$$T_n^{(p)} = \sum_{k=0}^{[n/p]} \frac{n!}{(n - pk)!\, k!\, p^k}.$$


Proof. Let $\sigma$ be a random permutation from $S_n$. It is clear that

$$P\{\sigma^p = e\} = T_n^{(p)}/n!,$$

and the study of $T_n^{(p)}$ is equivalent to the study of the probability $P\{\sigma^p = e\}$.

Since $T_n^{(p)} = a_{n,R}$, where $R = \{1, p\}$,

$$\{\sigma^p = e\} = \{\alpha_r = 0,\ r \ne 1,\ r \ne p\} = \{\alpha_1 + p\alpha_p = n\},$$

where $\alpha_r$ is the number of cycles of length $r$ in a random permutation from $S_n$. By
(4.1.4),

$$P\{\alpha_1 = n - pk,\ \alpha_p = k,\ \alpha_r = 0,\ r \ne 1,\ r \ne p\} = \frac{1}{(n - pk)!\, k!\, p^k}.$$

Summing these probabilities over admissible values of $k$ yields the assertion of
the theorem. □
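For small $n$ the formula of Theorem 5.1.1 can be compared with a direct enumeration of $S_n$. The sketch below is illustrative only; the function names are hypothetical, and the brute-force count is feasible only for $n$ up to about 8.

```python
from itertools import permutations
from math import factorial

def count_by_formula(n, p):
    """Sum over k of n! / ((n - p k)! k! p^k), as in Theorem 5.1.1 (p prime)."""
    return sum(
        factorial(n) // (factorial(n - p * k) * factorial(k) * p ** k)
        for k in range(n // p + 1)
    )

def count_by_enumeration(n, p):
    """Count permutations X of {0, ..., n-1} with X^p = e by brute force."""
    count = 0
    for perm in permutations(range(n)):
        is_solution = True
        for i in range(n):
            j = i
            for _ in range(p):  # apply the permutation p times to the point i
                j = perm[j]
            if j != i:
                is_solution = False
                break
        if is_solution:
            count += 1
    return count

for n, p in [(5, 2), (6, 3), (7, 5)]:
    print(n, p, count_by_formula(n, p), count_by_enumeration(n, p))
```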


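Set $a_{0,R} = 1$ and consider the (exponential) generating function of the sequence $a_{n,R}$,

$$f_R(z) = \sum_{n=0}^{\infty} \frac{a_{n,R}}{n!}\, z^n.$$


Theorem 5.1.2.

$$f_R(z) = \exp\Bigl\{\sum_{r \in R} \frac{z^r}{r}\Bigr\}.$$


Proof. According to (4.1.5),

$$\varphi(u, t_1, t_2, \ldots) = \sum_{n=0}^{\infty} \varphi_n(t_1, \ldots, t_n)\, u^n = \exp\Bigl\{\sum_{n=1}^{\infty} \frac{u^n t_n}{n}\Bigr\}, \tag{5.1.3}$$

where

$$\varphi_n(t_1, \ldots, t_n) = \sum_{m_1, \ldots, m_n} P\{\alpha_1 = m_1, \ldots, \alpha_n = m_n\}\, t_1^{m_1} \cdots t_n^{m_n},$$

and $\alpha_r$ is the number of cycles of length $r$ in a random permutation from $S_n$.
If we put $t_r = 1$ for $r \in R$ and $t_r = 0$ for $r \notin R$, we find that the corresponding
generating function $\varphi_n(t_1, \ldots, t_n)$ is

$$\sum_{M_R} P\{\alpha_r = m_r,\ r \in R;\ \alpha_r = 0,\ r \notin R\},$$

where

$$M_R = \Bigl\{m_1, \ldots, m_n : \sum_{r \in R} r m_r = n\Bigr\}.$$

It is easy to see that

$$\sum_{M_R} P\{\alpha_r = m_r,\ r \in R;\ \alpha_r = 0,\ r \notin R\} = P\Bigl\{\sum_{r \in R} r\alpha_r = n\Bigr\}.$$

Thus, substituting $t_r = 1$ if $r \in R$ and $t_r = 0$ if $r \notin R$ into (5.1.3) shows that the
generating function for

$$P\Bigl\{\sum_{r \in R} r\alpha_r = n\Bigr\} = \frac{a_{n,R}}{n!}$$

equals

$$\sum_{n=0}^{\infty} \frac{a_{n,R}}{n!}\, u^n = \exp\Bigl\{\sum_{r \in R} \frac{u^r}{r}\Bigr\}. \tag{5.1.4}$$
□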
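Theorem 5.1.2 can be checked on small cases by expanding the exponential as a power series. The sketch below assumes the standard identity $z f'(z) = f(z) \sum_{r \in R} z^r$ for $f = f_R$, which gives the recurrence $n c_n = \sum_{r \in R,\, r \le n} c_{n-r}$ for the coefficients $c_n = a_{n,R}/n!$; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def egf_coefficients(R, n_max):
    """Coefficients c_0..c_{n_max} of exp(sum_{r in R} z^r / r), computed exactly."""
    c = [Fraction(1)]
    for n in range(1, n_max + 1):
        c.append(sum((c[n - r] for r in R if r <= n), Fraction(0)) / n)
    return c

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple mapping i -> perm[i]."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start not in seen:
            length, j = 0, start
            while j not in seen:
                seen.add(j)
                j = perm[j]
                length += 1
            lengths.append(length)
    return lengths

def count_restricted(n, R):
    """a_{n,R}: permutations of degree n whose cycle lengths all lie in R (brute force)."""
    return sum(
        all(length in R for length in cycle_lengths(perm))
        for perm in permutations(range(n))
    )

R = {1, 2, 4}
coefficients = egf_coefficients(R, 6)
print([int(coefficients[n] * factorial(n)) for n in range(7)])
```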


In view of Theorem 5.1.2, it is convenient to apply the saddle-point method to
obtain the asymptotics of $T_n^{(p)}$. In the next section, we will use a different approach
based on the generalized scheme of allocation; however, for comparison, we now
present the derivation of the asymptotics of $T_n^{(2)}$ by applying the saddle-point
method.


Theorem 5.1.3. As $n \to \infty$,

$$T_n^{(2)} = \frac{1}{\sqrt{2}} \left(\frac{n}{e}\right)^{n/2} e^{\sqrt{n} - 1/4} \bigl(1 + O(n^{-1/4})\bigr).$$


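Proof. Since

$$\sum_{n=0}^{\infty} \frac{T_n^{(2)}}{n!}\, z^n = e^{z + z^2/2},$$

by Cauchy's formula

$$\frac{T_n^{(2)}}{n!} = \frac{1}{2\pi i} \oint \frac{e^{z + z^2/2}}{z^{n+1}}\, dz,$$

integrating over an arbitrary contour that goes around the point $z = 0$. We can
write

$$F(n) = \frac{1}{2\pi i} \oint e^{z + z^2/2 - n \log z}\, \frac{dz}{z}$$

and choose the contour of integration to be the circle passing through the saddle
point $\rho$, where the derivative of the function

$$f(z) = z + \frac{z^2}{2} - n \log z$$

is zero. From the equation

$$f'(z) = 1 + z - \frac{n}{z} = 0,$$

we find that

$$\rho = \sqrt{n + \tfrac{1}{4}} - \tfrac{1}{2}.$$

Thus, setting $z = \rho e^{i\varphi}$, $-\pi < \varphi \le \pi$, shows that

$$F(n) = \frac{1}{2\pi i} \oint e^{f(z)}\, \frac{dz}{z} = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{\rho e^{i\varphi} + \rho^2 e^{2i\varphi}/2 - n \log \rho - i n \varphi}\, d\varphi$$

$$= \frac{e^{\rho + \rho^2/2}}{2\pi \rho^n} \int_{-\pi}^{\pi} e^{\rho(\cos\varphi - 1) + \rho^2(\cos 2\varphi - 1)/2 + i\rho \sin\varphi + i(\rho^2 \sin 2\varphi)/2 - i n \varphi}\, d\varphi.$$

For the sake of brevity, we let $\alpha = \rho \sin\varphi + (\rho^2 \sin 2\varphi)/2 - n\varphi$ and write the
integral in the form

$$F(n) = \frac{e^{\rho + \rho^2/2}}{2\pi \rho^n} \int_{-\pi}^{\pi} \cos\alpha\; e^{-\rho(1 - \cos\varphi) - \rho^2(1 - \cos 2\varphi)/2}\, d\varphi
+ i\, \frac{e^{\rho + \rho^2/2}}{2\pi \rho^n} \int_{-\pi}^{\pi} \sin\alpha\; e^{-\rho(1 - \cos\varphi) - \rho^2(1 - \cos 2\varphi)/2}\, d\varphi.$$

Since $F(n)$ is real, we see that

$$F(n) = \frac{e^{\rho + \rho^2/2}}{2\pi \rho^n} \int_{-\pi}^{\pi} \cos\alpha\; e^{-\rho(1 - \cos\varphi) - \rho^2(1 - \cos 2\varphi)/2}\, d\varphi.$$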

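We choose $\varepsilon = \rho^{-3/4}$ and estimate the integral outside the $\varepsilon$-neighborhood
of zero, as $n \to \infty$, taking into account that $\rho = \sqrt{n + 1/4} - 1/2 \to \infty$. The
integrand is even, so we only estimate the integral over $\varphi$, $0 \le \varphi \le \pi$. It is
convenient to consider the graphs of the functions $\cos\varphi$ and $\cos 2\varphi$ included in the
exponent. With the help of the graphs presented in Figure 5.1.1, we can easily see
that

$$\left| \int_{\varepsilon}^{\pi/2} \cos\alpha\; e^{-\rho(1 - \cos\varphi) - \rho^2(1 - \cos 2\varphi)/2}\, d\varphi \right|
\le \int_{\varepsilon}^{\pi/2} e^{-\rho^2(1 - \cos 2\varphi)/2}\, d\varphi
\le \int_{\varepsilon}^{\pi/2} e^{-\rho^2 \varepsilon^2/2}\, d\varphi
\le \frac{\pi}{2}\, e^{-\rho^2 \varepsilon^2/2} = \frac{\pi}{2}\, e^{-\sqrt{\rho}/2},$$

since $1 - \cos 2\varepsilon \ge \varepsilon^2$ for sufficiently small $\varepsilon$.
Similarly,

$$\left| \int_{\pi/2}^{\pi} \cos\alpha\; e^{-\rho(1 - \cos\varphi) - \rho^2(1 - \cos 2\varphi)/2}\, d\varphi \right|
\le \int_{\pi/2}^{\pi} e^{-\rho(1 - \cos\varphi)}\, d\varphi
\le \int_{\pi/2}^{\pi} e^{-\rho}\, d\varphi = \frac{\pi}{2}\, e^{-\rho}.$$


Figure 5.1.1. Graphs of $\cos\varphi$ and $\cos 2\varphi$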


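Thus

$$F(n) = \frac{e^{\rho + \rho^2/2}}{2\pi \rho^n} \left( \int_{-\varepsilon}^{\varepsilon} \cos\alpha\; e^{-\rho(1 - \cos\varphi) - \rho^2(1 - \cos 2\varphi)/2}\, d\varphi + O\bigl(e^{-\sqrt{\rho}/2}\bigr) \right),$$

where $\varepsilon = \rho^{-3/4}$. Since $\rho + \rho^2 - n = 0$, we find that, in a neighborhood of zero,

$$\alpha = \rho \sin\varphi + \frac{\rho^2}{2} \sin 2\varphi - n\varphi
= \rho\varphi + \rho^2\varphi - n\varphi + O(\rho^2 |\varphi|^3) = O(\rho^2 |\varphi|^3),$$

and therefore

$$\cos\alpha = 1 + O(\alpha^2) = 1 + O(\rho^4 \varphi^6).$$

The exponent of the integrand can be represented in the domain of integration
as follows:

$$\rho(1 - \cos\varphi) + \frac{\rho^2}{2}(1 - \cos 2\varphi) = \frac{1}{2}(\rho + 2\rho^2)\varphi^2 + O(\rho^2 \varphi^4).$$

Thus, for $|\varphi| \le \varepsilon$,

$$\cos\alpha\; e^{-\rho(1 - \cos\varphi) - \rho^2(1 - \cos 2\varphi)/2}
= e^{-\varphi^2(\rho + 2\rho^2)/2}\bigl(1 + O(\rho^2\varphi^4 + \rho^4\varphi^6)\bigr)$$

$$= e^{-\varphi^2(\rho + 2\rho^2)/2}\bigl(1 + O(\rho^2\varepsilon^4 + \rho^4\varepsilon^6)\bigr)
= e^{-\varphi^2(\rho + 2\rho^2)/2}\bigl(1 + O(\rho^{-1/2})\bigr).$$

Therefore

$$\int_{-\varepsilon}^{\varepsilon} \cos\alpha\; e^{-\rho(1 - \cos\varphi) - \rho^2(1 - \cos 2\varphi)/2}\, d\varphi
= \int_{-\varepsilon}^{\varepsilon} e^{-\varphi^2(\rho + 2\rho^2)/2}\, d\varphi\, \bigl(1 + O(\rho^{-1/2})\bigr).$$

The change of variables $\theta = \sqrt{\rho + 2\rho^2}\,\varphi$ gives

$$\frac{1}{\sqrt{2\pi}} \int_{-\varepsilon}^{\varepsilon} e^{-\varphi^2(\rho + 2\rho^2)/2}\, d\varphi
= \frac{1}{\sqrt{2\pi(\rho + 2\rho^2)}} \int_{-\varepsilon\sqrt{\rho + 2\rho^2}}^{\varepsilon\sqrt{\rho + 2\rho^2}} e^{-\theta^2/2}\, d\theta
= \frac{1}{\sqrt{\rho + 2\rho^2}} \bigl(1 + O(e^{-\sqrt{\rho}})\bigr),$$

since as $x \to \infty$,

$$\int_{x}^{\infty} e^{-y^2/2}\, dy = \frac{1}{x}\, e^{-x^2/2}(1 + o(1)).$$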


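Combining the estimates gives

$$F(n) = \frac{e^{\rho + \rho^2/2}}{\sqrt{2\pi}\, \rho^n} \left( \frac{1}{\sqrt{\rho + 2\rho^2}} \bigl(1 + O(\rho^{-1/2})\bigr)\bigl(1 + O(e^{-\sqrt{\rho}})\bigr) + O\bigl(e^{-\sqrt{\rho}/2}\bigr) \right)
= \frac{e^{\rho + \rho^2/2}}{\rho^n \sqrt{2\pi(\rho + 2\rho^2)}} \bigl(1 + O(\rho^{-1/2})\bigr).$$

It remains to substitute $\rho = \sqrt{n + 1/4} - 1/2$ into this formula.
Since $F(n) = T_n^{(2)}/n!$, we find that

$$\log T_n^{(2)} = \log n! + \rho + \frac{\rho^2}{2} - n \log \rho - \frac{1}{2} \log(\rho + 2\rho^2) - \log\sqrt{2\pi} + O(\rho^{-1/2}). \tag{5.1.5}$$

Replace $\log n!$ by Stirling's formula

$$\log n! = n \log n - n + \frac{1}{2} \log n + \log\sqrt{2\pi} + O(n^{-1}). \tag{5.1.6}$$

It is easily seen that

$$\rho = \sqrt{n} - \frac{1}{2} + \frac{1}{8\sqrt{n}} + O(n^{-3/2}), \tag{5.1.7}$$

$$\rho^2 = n - \sqrt{n} + \frac{1}{2} + O\Bigl(\frac{1}{\sqrt{n}}\Bigr). \tag{5.1.8}$$

When we use (5.1.7), we find

$$n \log \rho = \frac{1}{2} n \log n + n \log\Bigl(1 - \frac{1}{2\sqrt{n}} + \frac{1}{8n} + O(n^{-2})\Bigr)
= \frac{1}{2} n \log n - \frac{1}{2}\sqrt{n} + O\Bigl(\frac{1}{\sqrt{n}}\Bigr). \tag{5.1.9}$$

Finally,

$$\log(\rho + 2\rho^2) = \log 2 + 2 \log \rho + \log\Bigl(1 + \frac{1}{2\rho}\Bigr)
= \log n + \log 2 + O\Bigl(\frac{1}{\sqrt{n}}\Bigr). \tag{5.1.10}$$

By substituting estimates (5.1.6)-(5.1.10) into (5.1.5), we obtain the final formula
for $\log T_n^{(2)}$:

$$\log T_n^{(2)} = \frac{n}{2} \log n - \frac{n}{2} + \sqrt{n} - \frac{1}{4} - \log\sqrt{2} + O(n^{-1/4}),$$

which implies the assertion of the theorem. □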
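The final formula for $\log T_n^{(2)}$ can be compared with exact values. The sketch below assumes the classical recurrence $T_n^{(2)} = T_{n-1}^{(2)} + (n-1) T_{n-2}^{(2)}$ for the number of involutions (the point $n$ is either fixed or paired with one of the other $n - 1$ points), which is not used in the text; the function names are hypothetical.

```python
import math

def involutions(n):
    """T_n^{(2)}: number of X in S_n with X^2 = e, via T_n = T_{n-1} + (n-1) T_{n-2}."""
    prev, cur = 1, 1  # T_0 and T_1
    for k in range(2, n + 1):
        prev, cur = cur, cur + (k - 1) * prev
    return cur

def log_asymptotic(n):
    """Main term of log T_n^{(2)}: (n/2) log n - n/2 + sqrt(n) - 1/4 - log sqrt(2)."""
    return 0.5 * n * (math.log(n) - 1.0) + math.sqrt(n) - 0.25 - 0.5 * math.log(2.0)

for n in (10, 100, 1000):
    # math.log accepts arbitrarily large Python integers
    print(n, math.log(involutions(n)) - log_asymptotic(n))
```

The printed differences should shrink as $n$ grows, in line with the remainder term of Theorem 5.1.3.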


5.2. Equations of prime degree


According to (4.3.4), the number $a_{n,R}$ of permutations in $S_{n,R}$ can be represented
in the form

$$a_{n,R} = \frac{n!\, e^{B_R(x)}}{x^n} \sum_{N=1}^{\infty} \frac{(B_R(x))^N}{N!}\, e^{-B_R(x)}\, P\{\xi_1^{(R)} + \cdots + \xi_N^{(R)} = n\}, \tag{5.2.1}$$

where

$$B_R(x) = \sum_{k \in R} \frac{x^k}{k} \tag{5.2.2}$$

and $\xi_1^{(R)}, \ldots, \xi_N^{(R)}$ are independent identically distributed random variables,

$$P\{\xi_1^{(R)} = k\} = \frac{x^k}{k B_R(x)}, \qquad k \in R, \tag{5.2.3}$$

and the positive parameter $x$ can be chosen arbitrarily from the domain of convergence
of the series in (5.2.2).

If $p$ is a prime number, then the number $T_n^{(p)}$ of solutions of equation (5.1.2)
is $a_{n,R}$, where $R = \{1, p\}$. Therefore

$$B_R(x) = x + \frac{x^p}{p},$$

and by (5.2.1),

$$T_n^{(p)} = \frac{n!\, e^{x + x^p/p}}{x^n} \sum_{N=1}^{\infty} \frac{(x + x^p/p)^N}{N!}\, e^{-x - x^p/p}\, P\{\zeta_N = n\}, \tag{5.2.4}$$

where $\zeta_N = \xi_1 + \cdots + \xi_N$, $\xi_1, \ldots, \xi_N$ are independent identically distributed
random variables and

$$P\{\xi_1 = 1\} = \frac{px}{px + x^p}, \qquad P\{\xi_1 = p\} = \frac{x^p}{px + x^p}. \tag{5.2.5}$$

Thus, to find the asymptotics of $T_n^{(p)}$, it suffices to choose an appropriate value
of $x$ and to prove a local limit theorem for the sum $\zeta_N = \xi_1 + \cdots + \xi_N$. The
summation of independent random variables taking two values is a classic problem
that is solved by the de Moivre-Laplace theorem. Therefore the approach based on
the representation (5.2.4) seems more suitable here than the saddle-point method.
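That the representation (5.2.4) holds for every admissible $x$ can be verified directly for small $n$. In the sketch below (hypothetical function names), $\zeta_N - N$ equals $p - 1$ times a binomial variable, so $P\{\zeta_N = n\}$ is a single binomial probability; the factors $e^{B}$ and $e^{-B}$ cancel and are omitted.

```python
from math import comb, factorial

def exact_count(n, p):
    """T_n^{(p)} from Theorem 5.1.1."""
    return sum(
        factorial(n) // (factorial(n - p * k) * factorial(k) * p ** k)
        for k in range(n // p + 1)
    )

def count_via_representation(n, p, x):
    """Right-hand side of (5.2.4) for a freely chosen parameter x > 0."""
    B = x + x ** p / p
    q = x ** p / (p * B)  # P{xi_1 = p}; P{xi_1 = 1} = 1 - q
    total = 0.0
    for N in range(1, n + 1):  # zeta_N >= N, so terms with N > n vanish
        j, remainder = divmod(n - N, p - 1)  # j = number of summands equal to p
        if remainder != 0 or j > N:
            continue
        probability = comb(N, j) * q ** j * (1.0 - q) ** (N - j)
        total += B ** N / factorial(N) * probability
    return factorial(n) / x ** n * total

for x in (0.5, 1.7, 3.0):
    print(x, count_via_representation(12, 3, x), exact_count(12, 3))
```

The freedom in $x$ is what the proofs below exploit: $x$ is tuned so that the event $\{\zeta_N = n\}$ lies in the central region of the distribution.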

We begin by applying this approach to the proof of Theorem 5.1.3. 


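Proof of Theorem 5.1.3. If $R = \{1, 2\}$, then obviously

$$P\{\xi_1 = 1\} = \frac{x}{B(x)} = \frac{2}{2 + x}, \qquad P\{\xi_1 = 2\} = \frac{x^2}{2B(x)} = \frac{x}{2 + x},$$

where $B(x) = B_R(x) = x + x^2/2$, and $E\zeta_N = N E\xi_1 = N(x + x^2)/B(x)$.
In the main part of the sum in (5.2.4), the parameter $N$ takes values close to
$B(x)$; therefore we choose $x$ such that

$$x + x^2 = n.$$

Hence,

$$x = \sqrt{n + \tfrac{1}{4}} - \tfrac{1}{2}, \qquad B(x) = \frac{n}{2} + \frac{1}{2}\sqrt{n + \tfrac{1}{4}} - \frac{1}{4},$$

$$E\xi_1 = \frac{x + x^2}{B(x)} = \frac{n}{B(x)}, \qquad D\xi_1 = \frac{2x}{(2 + x)^2},$$

and $D\xi_1 = 2n^{-1/2}(1 + o(1))$ as $n \to \infty$ (where $D$ denotes the variance).
Let

$$u = \frac{2(N - B(x))}{\sqrt{B(x) D\xi_1}}, \qquad A = \sqrt{2 \log n},$$

and divide the sum from (5.2.4) into two parts so that

$$T_n^{(2)} = \frac{n!\, e^{B(x)}}{x^n}\, (S_1 + S_2), \tag{5.2.6}$$

where

$$S_1 = \sum_{N:\, |u| \le A} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\}, \qquad
S_2 = \sum_{N:\, |u| > A} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\}.$$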


In the first sum,

$$\frac{|N - B(x)|}{\sqrt{B(x)}} \le \frac{A\sqrt{D\xi_1}}{2} = \frac{\sqrt{\log n}}{n^{1/4}}\,(1 + o(1)),$$

and by using the normal approximation to the Poisson distribution, we obtain, as
$n \to \infty$,

$$\frac{B^N(x)}{N!}\, e^{-B(x)} = \frac{1}{\sqrt{2\pi B(x)}}\,(1 + o(1)),$$

uniformly in the integers $N$ such that $|u| \le A$.
The sum $\zeta_N - N$ has the binomial distribution with $N$ trials and the probability
of success $p(x) = x/(2 + x)$. If $|u| \le A$, then $N = B(x)(1 + o(1))$, and

$$N p(x)(1 - p(x)) = \frac{2xN}{(2 + x)^2} = \sqrt{n}\,(1 + o(1))$$

as $n \to \infty$. Therefore the normal approximation to the binomial distribution is
valid. For $|u| \le A = \sqrt{2 \log n}$,

$$\frac{n - N E\xi_1}{\sqrt{N D\xi_1}} = \frac{n(B(x) - N)}{B(x)\sqrt{N D\xi_1}} = -u\bigl(1 + O(n^{-1/2})\bigr).$$

Therefore, by the de Moivre-Laplace theorem,

$$P\{\zeta_N = n\} = \frac{1}{\sqrt{2\pi N D\xi_1}}\, e^{-u^2/2}(1 + o(1))$$

uniformly in the integers $N$ such that $|u| \le A$.

The behavior of the functions $g_1(N) = B^N(x) e^{-B(x)}/N!$ and $g_2(N) = P\{\zeta_N = n\}$
is represented approximately in Figure 5.2.1.

Figure 5.2.1. The graphs of $g_1(N)$ and $g_2(N)$

The sum $S_1$ can be estimated as follows:

$$S_1 = \sum_{N:\, |u| \le A} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\}
= \sum_{N:\, |u| \le A} \frac{1}{\sqrt{2\pi B(x)}} \cdot \frac{1}{\sqrt{2\pi N D\xi_1}}\, e^{-u^2/2}(1 + o(1))$$

$$= \frac{1}{2\sqrt{2\pi B(x)}} \cdot \frac{1}{\sqrt{2\pi}} \sum_{N:\, |u| \le A} \frac{2}{\sqrt{B(x) D\xi_1}}\, e^{-u^2/2}(1 + o(1)).$$

The last sum is an integral sum of the function $e^{-u^2/2}$ with step $2(B(x) D\xi_1)^{-1/2}$,
so as $n \to \infty$,

$$S_1 = \frac{1}{2\sqrt{2\pi B(x)}} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-u^2/2}\, du\, (1 + o(1)) = \frac{1}{2\sqrt{2\pi B(x)}}\,(1 + o(1)).$$

By virtue of monotonicity, for $|u| > A$,

$$P\{\zeta_N = n\} \le \frac{1}{\sqrt{2\pi B(x) D\xi_1}}\, e^{-A^2/2}(1 + o(1)),$$

and there exists a constant $c$ such that

$$P\{\zeta_N = n\} \le c n^{-1/4} e^{-A^2/2} \le c n^{-5/4}.$$

Therefore

$$S_2 = \sum_{N:\, |u| > A} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\} \le c n^{-5/4}.$$

Thus

$$S = S_1 + S_2 = S_1(1 + o(1)) = \frac{1}{2\sqrt{2\pi B(x)}}\,(1 + o(1)),$$

and by substituting this estimate into (5.2.6), we obtain

$$T_n^{(2)} = \frac{n!\, e^{B(x)}}{2 x^n \sqrt{2\pi B(x)}}\,(1 + o(1)).$$


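It remains to substitute

$$x = \sqrt{n + \tfrac{1}{4}} - \tfrac{1}{2}, \qquad B(x) = \frac{n}{2} + \frac{1}{2}\sqrt{n + \tfrac{1}{4}} - \frac{1}{4}$$

into the formula. It is easily seen that

$$e^{B(x)} = e^{n/2 + \sqrt{n}/2 - 1/4}(1 + o(1)), \qquad x^n = n^{n/2} e^{-\sqrt{n}/2}(1 + o(1)).$$

Therefore

$$T_n^{(2)} = \frac{n^n \sqrt{n}\, e^{-n} \sqrt{2\pi}\; e^{n/2 + \sqrt{n}/2 - 1/4}}{n^{n/2} e^{-\sqrt{n}/2} \cdot 2\sqrt{2\pi}\, \sqrt{n/2}}\,(1 + o(1))
= e^{-1/4}\, 2^{-1/2} \left(\frac{n}{e}\right)^{n/2} e^{\sqrt{n}}\,(1 + o(1)),$$

and Theorem 5.1.3 with the remainder term of the form $1 + o(1)$ is proved.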
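The intermediate formula obtained from (5.2.6), before $x$ and $B(x)$ are expanded, can also be compared with exact values. The sketch below is illustrative: it again assumes the involution recurrence $T_n^{(2)} = T_{n-1}^{(2)} + (n-1) T_{n-2}^{(2)}$, which is not part of the text, and the function names are hypothetical.

```python
import math

def involutions(n):
    """T_n^{(2)} via the recurrence T_n = T_{n-1} + (n-1) T_{n-2}."""
    prev, cur = 1, 1  # T_0 and T_1
    for k in range(2, n + 1):
        prev, cur = cur, cur + (k - 1) * prev
    return cur

def log_allocation_formula(n):
    """log of n! e^{B(x)} / (2 x^n sqrt(2 pi B(x))), where x solves x + x^2 = n."""
    x = math.sqrt(n + 0.25) - 0.5
    B = x + x * x / 2.0
    return (math.lgamma(n + 1) + B - n * math.log(x)
            - math.log(2.0) - 0.5 * math.log(2.0 * math.pi * B))

for n in (100, 2500):
    print(n, math.log(involutions(n)) - log_allocation_formula(n))
```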


We now turn to the case where $p$ is a fixed prime number, $p \ge 3$, and consider
the number $T_n^{(p)}$ of solutions of equation (5.1.2).


Theorem 5.2.1. If $n \to \infty$ and $p$ is prime, $p \ge 3$, then

$$T_n^{(p)} = \left(\frac{n}{e}\right)^{n(1 - 1/p)} p^{-1/2}\, e^{n^{1/p}}\,(1 + o(1)).$$


Proof. The proof is almost the same as the proof of Theorem 5.1.3 given above
and is also based on relation (5.2.4). For $R = \{1, p\}$,

$$B(x) = B_R(x) = x + x^p/p,$$

and the independent random variables $\xi_1, \ldots, \xi_N$ in (5.2.4) have the distribution

$$P\{\xi_1 = 1\} = \frac{x}{B(x)} = \frac{px}{px + x^p}, \qquad P\{\xi_1 = p\} = \frac{x^p}{pB(x)} = \frac{x^p}{px + x^p}.$$

We choose the parameter $x$ such that

$$x + x^p = n. \tag{5.2.7}$$

Then

$$x = n^{1/p} - \frac{1}{p}\, n^{-1 + 2/p} + O(n^{-2 + 3/p}),$$

$$B(x) = x + x^p/p = \frac{n}{p} + \frac{p - 1}{p}\, n^{1/p} + O(n^{-1 + 2/p}),$$

$$p(x) = \frac{x^p}{px + x^p} = 1 - p n^{-1 + 1/p} + O(n^{-2 + 2/p}),$$

$$E\xi_1 = n/B(x), \qquad D\xi_1 = (p - 1)^2 p\, n^{-1 + 1/p}(1 + o(1)).$$

Let

$$u = \frac{p(N - B(x))}{\sqrt{B(x) D\xi_1}}, \qquad A = \sqrt{2 \log n},$$

and divide the sum in (5.2.4) into two parts so that

$$T_n^{(p)} = \frac{n!\, e^{B(x)}}{x^n}\,(S_1 + S_2),$$

where

$$S_1 = \sum_{N:\, |u| \le A} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\}, \qquad
S_2 = \sum_{N:\, |u| > A} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\}.$$

In the first sum, $N = B(x) + o\bigl(\sqrt{B(x)}\bigr)$ and

$$\frac{B^N(x)}{N!}\, e^{-B(x)} = \frac{1}{\sqrt{2\pi B(x)}}\,(1 + o(1))$$


uniformly in the integers $N$ such that $|u| \le A$.
Let $\xi_i^* = (\xi_i - 1)/(p - 1)$, $i = 1, \ldots, N$. The sum

$$\zeta_N^* = \xi_1^* + \cdots + \xi_N^*$$

has the binomial distribution with $N$ trials and the probability of success

$$p(x) = x^p/(px + x^p) = 1 - p n^{-1 + 1/p} + O(n^{-2 + 2/p})$$

as $n \to \infty$. It is clear that

$$P\{\zeta_N = n\} = P\{\zeta_N^* = (n - N)/(p - 1)\},$$

and if $(n - N)/(p - 1)$ is not an integer, then $P\{\zeta_N = n\} = 0$. Since $E\xi_1 = n/B(x)$,
$B(x) = (n/p)(1 + o(1))$, and

$$\frac{(n - N)/(p - 1) - N E\xi_1^*}{\sqrt{N D\xi_1^*}} = \frac{n - N E\xi_1}{\sqrt{N D\xi_1}} = \frac{n(B(x) - N)}{B(x)\sqrt{N D\xi_1}}
= -\frac{nu}{p\sqrt{B(x) N}} = -u(1 + o(1))$$

as $n \to \infty$ and $|u| \le A$, by using the de Moivre-Laplace theorem, we obtain

$$P\{\zeta_N = n\} = P\{\zeta_N^* = (n - N)/(p - 1)\} = \frac{1}{\sqrt{2\pi N D\xi_1^*}}\, e^{-u^2/2}(1 + o(1))$$

uniformly in the integers $N$ such that $(n - N)/(p - 1)$ is an integer and $|u| \le A$.
Therefore

$$S_1 = \sum_{N:\, |u| \le A} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\}
= \sum_{N:\, |u| \le A} \frac{p - 1}{\sqrt{2\pi B(x)}\,\sqrt{2\pi N D\xi_1}}\, e^{-u^2/2}(1 + o(1))$$

$$= \frac{p - 1}{p\sqrt{2\pi B(x)}} \cdot \frac{1}{\sqrt{2\pi}} \sum_{N:\, |u| \le A} \frac{p}{\sqrt{B(x) D\xi_1}}\, e^{-u^2/2}(1 + o(1)),$$

where the summation is over the integers $N$ such that $(n - N)/(p - 1)$ is an integer.
The last sum is an integral sum of the function $e^{-u^2/2}$ with step $p(B(x) D\xi_1)^{-1/2}$.
Since the summation is over $N$ such that $(n - N)/(p - 1)$ is an integer, that is,
only each $(p - 1)$th term is included in the sum, we obtain

$$\frac{p - 1}{\sqrt{2\pi}} \sum_{N:\, |u| \le A} \frac{p}{\sqrt{B(x) D\xi_1}}\, e^{-u^2/2} \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-u^2/2}\, du = 1.$$

Therefore, as $n \to \infty$,

$$S_1 = \frac{1}{p\sqrt{2\pi B(x)}}\,(1 + o(1)).$$

For $|u| > A$,

$$P\{\zeta_N = n\} \le \frac{p - 1}{\sqrt{2\pi B(x) D\xi_1}}\, e^{-A^2/2}(1 + o(1)),$$

and there exists a constant $c$ such that

$$P\{\zeta_N = n\} \le c n^{-1 - 1/(2p)}$$

and $S_2 \le c n^{-1 - 1/(2p)}$. Thus

$$S = S_1 + S_2 = S_1(1 + o(1)) = \frac{1}{p\sqrt{2\pi B(x)}}\,(1 + o(1)),$$

and by substituting this estimate into the analogue of (5.2.6) given above, we obtain

$$T_n^{(p)} = \frac{n!\, e^{B(x)}}{x^n\, p\sqrt{2\pi B(x)}}\,(1 + o(1)). \tag{5.2.8}$$

It is easily seen that

$$e^{B(x)} = e^{n/p + (p - 1) n^{1/p}/p}(1 + o(1)), \qquad x^n = n^{n/p} e^{-n^{1/p}/p}(1 + o(1)).$$

When we substitute these expressions into (5.2.8), we obtain the assertion of
Theorem 5.2.1. □

A slight refinement of the estimates used in the proof of Theorem 5.2.1 allows
us to show that the assertion of the theorem is valid if $p$ tends to infinity slowly,
as specified below, where we prove a more general result.
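Theorem 5.2.1 can be checked against exact values for a fixed prime. The sketch below assumes the recurrence $T_n^{(p)} = T_{n-1}^{(p)} + (n-1)(n-2)\cdots(n-p+1)\, T_{n-p}^{(p)}$ (the point $n$ is either fixed or lies in a $p$-cycle), which is not used in the text; the function names are hypothetical.

```python
import math

def count_solutions_prime(n, p):
    """T_n^{(p)} for prime p via T_k = T_{k-1} + (k-1)(k-2)...(k-p+1) T_{k-p}."""
    T = [1] * (n + 1)
    for k in range(1, n + 1):
        T[k] = T[k - 1]
        if k >= p:
            falling = 1
            for i in range(1, p):  # number of ways to complete the p-cycle through k
                falling *= k - i
            T[k] += falling * T[k - p]
    return T[n]

def log_theorem_5_2_1(n, p):
    """log of (n/e)^{n(1 - 1/p)} p^{-1/2} exp(n^{1/p})."""
    return n * (1.0 - 1.0 / p) * (math.log(n) - 1.0) - 0.5 * math.log(p) + n ** (1.0 / p)

for n in (100, 1000):
    print(n, math.log(count_solutions_prime(n, 3)) - log_theorem_5_2_1(n, 3))
```

For $p = 3$ the difference of logarithms decays only like a small negative power of $n$, so moderate discrepancies at small $n$ are expected.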


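Theorem 5.2.2. If $p$ is prime and $n, p \to \infty$ in such a way that $p/n \to 0$, then

$$T_n^{(p)} = \left(\frac{n}{e}\right)^{n(1 - 1/p)} p^{1/2} \sum_{k=0}^{\infty} \frac{(n^{1/p})^{m + kp}}{(m + kp)!}\,(1 + o(1)); \tag{5.2.9}$$

in particular, if $p^{-2} n^{1/p} \to \infty$, then

$$T_n^{(p)} = \left(\frac{n}{e}\right)^{n(1 - 1/p)} p^{-1/2}\, e^{n^{1/p}}\,(1 + o(1)), \tag{5.2.10}$$

and if $p^{-1} n^{1/p} \to 0$, then

$$T_n^{(p)} = \left(\frac{n}{e}\right)^{n(1 - 1/p)} p^{1/2}\, \frac{n^{m/p}}{m!}\,(1 + o(1)), \tag{5.2.11}$$

where $m = n - p[n/p]$, and $[c]$ is the integer part of $c$.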
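Multiplied by $e^{-y}$ with $y = n^{1/p}$, the sum in (5.2.9) is the probability that a Poisson variable with mean $y$ falls into the residue class $m$ modulo $p$. The two regimes (5.2.10) and (5.2.11) correspond to this probability being close to $1/p$ (when $y$ is large compared with $p^2$) or dominated by its first term (when $y$ is small compared with $p$). The sketch below illustrates both regimes with $y$, $p$, $m$ chosen freely for illustration rather than tied to a particular $n$; the function name is hypothetical.

```python
import math

def poisson_class_sum(y, p, m):
    """sum over k >= 0 of y^(m + k p) / (m + k p)! * exp(-y), truncated far in the tail."""
    upper = int(y + 40.0 * math.sqrt(y) + 100.0)  # terms beyond this are negligible
    total = 0.0
    l = m
    while l <= upper:
        # each term is evaluated in log scale to avoid overflow
        total += math.exp(l * math.log(y) - y - math.lgamma(l + 1))
        l += p
    return total

# Regime of (5.2.10): y / p^2 large, the class probability is close to 1/p.
print(poisson_class_sum(400.0, 3, 1), 1.0 / 3.0)

# Regime of (5.2.11): y / p small, the first term y^m e^{-y} / m! dominates.
y, p, m = 0.05, 11, 3
print(poisson_class_sum(y, p, m), y ** m * math.exp(-y) / math.factorial(m))
```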


Proof. The proof is similar to the proof of Theorem 5.2.1, but now we need to trace
the effect of the parameter $p$ in the remainder terms of the asymptotic formulas
and to use a representation in terms of the Poisson probabilities instead of the
representation (5.2.4).

It follows from the equation $x + x^p = n$ that under the conditions of the theorem,

$$x = n^{1/p} - \frac{n^{2/p}}{np} + O\Bigl(\frac{n^{3/p}}{n^2 p}\Bigr), \tag{5.2.12}$$

$$B = \frac{n}{p} + \frac{(p - 1) n^{1/p}}{p} + O\Bigl(\frac{n^{2/p}}{np}\Bigr). \tag{5.2.13}$$

Therefore it is easy to confirm that

$$p(x) = \frac{x^p}{px + x^p} = 1 - p n^{-1 + 1/p} + O(p^2 n^{-2 + 2/p}).$$

The random variable $(\zeta_N - N)/(p - 1)$ can be represented in the form

$$\frac{\zeta_N - N}{p - 1} = N - \eta_N,$$

where $\eta_N$ has the binomial distribution with $N$ trials and probability of success

$$q = q(x) = 1 - p(x) = p n^{-1 + 1/p}\bigl(1 + O(p n^{-1 + 1/p})\bigr). \tag{5.2.14}$$

Therefore it is not difficult to see that for $n = m + p[n/p]$, the probability
$P\{\zeta_N = n\}$ is nonzero if

$$N = [n/p] + m + k(p - 1), \qquad 0 \le k \le [n/p],$$

and for such $N$,

$$P\{\zeta_N = n\} = P\{\eta_N = l\},$$

where $l = m + kp$. Thus, the representation (5.2.4) takes the form

$$T_n^{(p)} = \frac{n!}{x^n}\, e^{B} \sum_{N=1}^{\infty} \frac{B^N}{N!}\, e^{-B}\, P\{\zeta_N = n\}
= \frac{n!}{x^n}\, e^{B} \sum_{k=0}^{[n/p]} \frac{B^N}{N!}\, e^{-B} \binom{N}{l} q^l (1 - q)^{N - l}.$$

This results in the representation

$$T_n^{(p)} = \frac{n!}{x^n}\, e^{B} \sum_{k=0}^{[n/p]} \frac{(Bq)^l}{l!}\, e^{-Bq}\, \frac{(B(1 - q))^{N - l}}{(N - l)!}\, e^{-B(1 - q)}, \tag{5.2.15}$$


where $l = m + pk$, $N = [n/p] + m + k(p - 1)$, $m = n - p[n/p]$; and to obtain
the basic assertion of the theorem, we must sum the products of two Poisson
probabilities. Let

$$s = \sum_{k=0}^{[n/p]} \frac{(Bq)^{m + pk}}{(m + pk)!}\, e^{-Bq}, \qquad a = \bigl(n^{1/p} \sqrt{p/n}\bigr)^{1/2},$$

and divide $s$ into two parts,

$$s_1 = \sum_{k:\, |(N - B)B^{-1/2}| \le a} \frac{(Bq)^{m + pk}}{(m + pk)!}\, e^{-Bq}, \qquad
s_2 = \sum_{k:\, |(N - B)B^{-1/2}| > a} \frac{(Bq)^{m + pk}}{(m + pk)!}\, e^{-Bq}.$$

Note that $a \to 0$ under the conditions of the theorem, and the normal approximation
to the second multiplier

$$\frac{(B(1 - q))^{N - l}}{(N - l)!}\, e^{-B(1 - q)} = \frac{1}{\sqrt{2\pi B}}\,(1 + o(1)) \tag{5.2.16}$$

is valid for all $l$, $N$ such that $|(N - B)B^{-1/2}| \le a$, and outside this region,

$$\frac{(B(1 - q))^{N - l}}{(N - l)!}\, e^{-B(1 - q)} \le \frac{c}{\sqrt{2\pi B}}, \tag{5.2.17}$$

where $c$ is a constant.




It remains to show that $s_2 = o(s_1)$ and

$$s_1 = \sum_{k=0}^{\infty} \frac{(n^{1/p})^{m + pk}}{(m + pk)!}\, e^{-n^{1/p}}\,(1 + o(1)). \tag{5.2.18}$$

For the sake of brevity, we let $b = Bq$. It follows from (5.2.13) and (5.2.14)
that under the conditions of the theorem,

$$b = n^{1/p}\bigl(1 + O(p n^{-1 + 1/p})\bigr). \tag{5.2.19}$$

It is clear that

$$s_1 \ge \frac{b^{[b] + p}}{([b] + p)!}\, e^{-b},$$

since at least one of the summands with $l$ from the interval $([b], [b] + p)$ is included
in the sum $s_1$.

On the other hand, the summation over $N > B + a\sqrt{B}$ is the summation over $l$,
with $l = m + pk$ such that $l > b + a\sqrt{B} + o(\sqrt{B})$. Let $l_0 = b + a\sqrt{B} + o(\sqrt{B})$.
Then,

$$s_2 \le \frac{b^{l_0}}{l_0!}\, e^{-b} \Bigl(1 + \frac{b}{l_0} + \frac{b^2}{l_0^2} + \cdots\Bigr)
\le \frac{b^{l_0}}{l_0!}\, e^{-b}\, \frac{l_0}{l_0 - b} \le \frac{c\, b^{l_0}}{l_0!}\, e^{-b},$$

since $b/l_0 \to 0$. Therefore,

$$\frac{s_2}{s_1} \le \frac{c\, b^{l_0 - [b] - p}}{l_0 (l_0 - 1) \cdots ([b] + p + 1)}
\le \frac{c}{(1 + (l_0 - b)/b) \cdots (1 + ([b] - b + p + 1)/b)}$$

$$\le \frac{c_1 b^3}{(l_0 - b)^3} \le \frac{c_2 b^3}{(a\sqrt{B})^3} \le \frac{c_3\, n^{3/p} p^{3/2}}{a^3 n^{3/2}},$$

where $c_1$, $c_2$, and $c_3$ are constants. By the choice of $a$, the last bound tends to zero.
This estimate, (5.2.16), (5.2.17), and (5.2.19) imply (5.2.18). Assertion (5.2.9)
follows from (5.2.15), (5.2.16), (5.2.17), and (5.2.18).
If $p^{-2} n^{1/p} \to \infty$, then by using the normal approximation, we obtain

$$\sum_{k=0}^{\infty} \frac{(n^{1/p})^{m + pk}}{(m + pk)!}\, e^{-n^{1/p}} = \frac{1}{p}\,(1 + o(1)).$$

This yields assertion (5.2.10) of the theorem.
Assertion (5.2.11) follows from the fact that if $p^{-1} n^{1/p} \to 0$, then

$$\sum_{k=0}^{\infty} \frac{(n^{1/p})^{m + pk}}{(m + pk)!} = \frac{n^{m/p}}{m!}\,(1 + o(1)).$$
□

5.3. Equations of compound degree


In this section, we consider the number $T_n^{(d)}$ of solutions of the equation

$$X^d = e, \tag{5.3.1}$$

where $d$ is a natural number, $e$ is the identity permutation, and $X$ is an unknown
element of the symmetric group $S_n$. The cases where $d$ is a prime number were
considered in the previous sections. Let $d$ be a compound number and let
$1 = d_0 < d_1 < \cdots < d_r = d$ be all different divisors of $d$. A permutation $X$ is a
solution of equation (5.3.1) if and only if the lengths of cycles of $X$ belong to
the set $\{d_0, \ldots, d_r\}$. Therefore $T_n^{(d)}$ is equal to the number $a_{n,R}$ of permutations
in $S_{n,R}$, where $R = \{d_0, \ldots, d_r\}$. The following is a generalization of Theorems
5.1.3 and 5.2.1.


Theorem 5.3.1. If $n \to \infty$ and $d$ is a fixed number, $d \ge 2$, then

$$T_n^{(d)} = \left(\frac{n}{e}\right)^{n} n^{-n/d}\, d^{-1/2} \exp\Bigl\{\sum_{j | d} \frac{n^{j/d}}{j}\Bigr\}\,(1 + o(1))$$

if $d$ is odd, and

$$T_n^{(d)} = \left(\frac{n}{e}\right)^{n} n^{-n/d}\, d^{-1/2} \exp\Bigl\{\sum_{j | d} \frac{n^{j/d}}{j} - \frac{1}{2d}\Bigr\}\,(1 + o(1))$$

if $d$ is even.


Note that the summation in the above formulas is over the divisors $j$ of the
number $d$, and if we put $d = 2$ and $d = p$, we obtain Theorems 5.1.3 and 5.2.1,
respectively.


Proof. Let $1 = d_0 < d_1 < \cdots < d_r = d$ be all the divisors of $d$, $R = \{d_0, \ldots, d_r\}$,

$$B(x) = B_R(x) = \sum_{k \in R} \frac{x^k}{k},$$

and let $\xi_1, \ldots, \xi_N$ be independent identically distributed random variables,

$$P\{\xi_1 = k\} = \frac{x^k}{k B(x)}, \qquad k \in R, \tag{5.3.2}$$

where the positive parameter $x$ can be chosen arbitrarily. Since $d$ is compound,
$r \ge 2$.
Put $\zeta_N = \xi_1 + \cdots + \xi_N$. It is clear that

$$E\zeta_N = N E\xi_1 = N(x + x^{d_1} + \cdots + x^{d_{r-1}} + x^d)/B(x).$$

We choose the parameter $x$ such that

$$x + x^{d_1} + \cdots + x^{d_{r-1}} + x^d = n, \tag{5.3.3}$$

and in what follows, we consider the random variables $\xi_1, \ldots, \xi_N$ with distribution
(5.3.2), where $x$ is the solution of this equation.
By iteration, it is not difficult to determine that

$$x^d = n - n^{d_{r-1}/d} - \cdots - n^{1/d} + o(1) \tag{5.3.4}$$

if $d$ is odd, and

$$x^d = n - n^{d_{r-1}/d} - \cdots - n^{1/d} + \frac{1}{2} + o(1) \tag{5.3.5}$$

if $d$ is even.
Since $T_n^{(d)} = a_{n,R}$, where $R = \{1, d_1, \ldots, d_{r-1}, d\}$, we can use the representation
(5.2.1) and obtain

$$T_n^{(d)} = \frac{n!\, e^{B(x)}}{x^n} \sum_{N=1}^{\infty} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\}. \tag{5.3.6}$$

Therefore, to obtain the assertions of Theorem 5.3.1, it is sufficient to find the
asymptotics of $P\{\zeta_N = n\}$.
It is not difficult to see that

$$E\xi_1 = \frac{n}{B(x)}, \qquad
D\xi_1 = \frac{1}{B(x)}\,(x + d_1 x^{d_1} + \cdots + d x^d) - \frac{n^2}{B^2(x)}
= \frac{1}{B(x)} \sum_{j | d} j x^j - \frac{n^2}{B^2(x)},$$

where the summation is over the integers $j$, which are the divisors of $d$. In view
of (5.3.4) and (5.3.5),

$$B(x) = \sum_{j | d} \frac{n^{j/d}}{j}\,(1 + o(1)) \tag{5.3.7}$$

as $n \to \infty$. By estimating the second and third central moments of $\xi_1$ and using
the characteristic function of $\zeta_N$, we can prove that the distribution of the random
variable $(\zeta_N - N E\xi_1)/\sqrt{N D\xi_1}$ converges to the normal law with parameters
$(0, 1)$ as $N D\xi_1 \to \infty$. If $h$ is the maximal step of the lattice containing the set $R$,
then the local limit theorem is valid on this lattice. We omit the proof of this local
theorem.

The remaining part of the proof of Theorem 5.3.1 repeats the corresponding
part of the proof of Theorem 5.1.3 from Section 5.2. We put

$$v = \frac{n - N E\xi_1}{\sqrt{N D\xi_1}}, \qquad u = \frac{d(N - B(x))}{\sqrt{B(x) D\xi_1}}, \qquad A = 2\sqrt{2 \log n},$$

and divide the sum from (5.3.6) into two parts so that

$$T_n^{(d)} = \frac{n!}{x^n}\, e^{B(x)}\,(S_1 + S_2),$$

where

$$S_1 = \sum_{N:\, |u| \le A} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\}, \qquad
S_2 = \sum_{N:\, |u| > A} \frac{B^N(x)}{N!}\, e^{-B(x)}\, P\{\zeta_N = n\}.$$


It is easy to see that $N = B(x)(1 + o(1))$ for $|u| \le A = 2\sqrt{2 \log n}$ and

$$v = \frac{n(B(x) - N)}{B(x)\sqrt{N D\xi_1}} = -u\bigl(1 + O(n^{-1/2})\bigr), \tag{5.3.8}$$

and by the local limit theorem,

$$P\{\zeta_N = n\} = \frac{h}{\sqrt{2\pi N D\xi_1}}\, e^{-u^2/2}(1 + o(1))$$

uniformly in the integers $N$ such that $|u| \le A$ and $(n - N)/h$ are integers. Recall
that $h$ is the maximal span of the distribution of $\xi_1$.
As in the proof of Theorem 5.1.3, Section 5.2, we obtain

$$S_1 = \sum_{N:\, |u| \le A} \frac{1}{\sqrt{2\pi B(x)}} \cdot \frac{h}{\sqrt{2\pi N D\xi_1}}\, e^{-u^2/2}(1 + o(1))
= \frac{1}{d\sqrt{2\pi B(x)}} \cdot \frac{1}{\sqrt{2\pi}} \sum_{N:\, |u| \le A} \frac{dh}{\sqrt{B(x) D\xi_1}}\, e^{-u^2/2}(1 + o(1)).$$

The last sum is an integral sum of the function $e^{-u^2/2}$, with step $d(B(x) D\xi_1)^{-1/2}$,
and the summation is over $N$ such that $(n - N)/h$ are integers, that is, only each
$h$th term is included in the sum. Since $h$ and $d$ are relatively prime, we see that

$$\frac{1}{\sqrt{2\pi}} \sum_{N:\, |u| \le A} \frac{hd}{\sqrt{B(x) D\xi_1}}\, e^{-u^2/2} \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-u^2/2}\, du = 1,$$

and

$$S_1 = \frac{1}{d\sqrt{2\pi B(x)}}\,(1 + o(1)).$$

In estimating $S_2$, it will not be possible now to use the monotonicity of the tails
of the function $g_2(N) = P\{\zeta_N = n\}$ as we did in the proof of Theorem 5.1.3 in
Section 5.2 (see Figure 5.2.1). By (5.3.8), in the first sum, $|v| < \sqrt{2 \log n}$ for a
sufficiently large $n$. Therefore, in the second sum,

$$P\{\zeta_N = n\} \le \sum_{n:\, |v| > \sqrt{2 \log n}} P\{\zeta_N = n\}.$$

By the integral limit theorem,

$$\sum_{n:\, |v| > \sqrt{2 \log n}} P\{\zeta_N = n\} = \frac{2}{\sqrt{2\pi}} \int_{\sqrt{2 \log n}}^{\infty} e^{-z^2/2}\, dz\,(1 + o(1)),$$

and there exists a constant $c$ such that, in the second sum,

$$P\{\zeta_N = n\} \le c n^{-1}.$$

Thus, $S_1 + S_2 = S_1(1 + o(1))$, and we obtain

$$T_n^{(d)} = \frac{n!\, e^{B(x)}}{x^n\, d\sqrt{2\pi B(x)}}\,(1 + o(1)). \tag{5.3.9}$$

This implies the assertions of the theorem because

$$e^{B(x)} = \exp\Bigl\{\sum_{j | d} \frac{x^j}{j}\Bigr\},$$


and $x^n$ can be represented in the cases of odd and even $d$ as follows.
Let $d$ be odd, then according to (5.3.4),

$$x^n = n^{n/d}\, e^{-(n^{d_{r-1}/d} + \cdots + n^{1/d})/d}\,(1 + o(1)).$$

For $1 \le j < d$,

$$x^j = n^{j/d} + o(1),$$

and for $j = d$,

$$x^d = n - n^{d_{r-1}/d} - \cdots - n^{1/d} + o(1).$$

Thus

$$e^{B(x)} = \exp\Bigl\{\sum_{j | d} \frac{n^{j/d}}{j} - (n^{d_{r-1}/d} + \cdots + n^{1/d})/d + o(1)\Bigr\},$$

and

$$x^{-n} e^{B(x)} = n^{-n/d} \exp\Bigl\{\sum_{j | d} \frac{n^{j/d}}{j}\Bigr\}\,(1 + o(1)).$$

When we substitute the last expression into (5.3.9), we obtain the first assertion of
the theorem.
If $d$ is even, we note that $2d_{r-1} = d$ and use (5.3.5) to obtain

$$x^n = n^{n/d}\, e^{-(n^{d_{r-1}/d} + \cdots + n^{1/d} - 1/2)/d - 1/(2d)}\,(1 + o(1)).$$

For $1 \le j < d_{r-1}$,

$$x^j = n^{j/d} + o(1);$$

for $j = d$,

$$x^d = n - n^{d_{r-1}/d} - \cdots - n^{1/d} + \frac{1}{2} + o(1);$$

and for $j = d_{r-1}$,

$$x^{d_{r-1}} = n^{d_{r-1}/d} - \frac{1}{2} + o(1).$$

Thus

$$e^{B(x)} = \exp\Bigl\{\sum_{j | d} \frac{n^{j/d}}{j} - \frac{1}{d} - (n^{d_{r-1}/d} + \cdots + n^{1/d} - 1/2)/d + o(1)\Bigr\},$$

$$x^{-n} e^{B(x)} = n^{-n/d} \exp\Bigl\{\sum_{j | d} \frac{n^{j/d}}{j} - \frac{1}{2d}\Bigr\}\,(1 + o(1)).$$

The substitution of the last expression into (5.3.9) gives us the second assertion
of the theorem. □
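Theorem 5.3.1 can be compared with exact counts for a small compound $d$. The sketch below assumes the recurrence $T_k^{(d)} = \sum_{j | d,\ j \le k} (k-1)(k-2)\cdots(k-j+1)\, T_{k-j}^{(d)}$ (the point $k$ lies in a cycle whose length $j$ divides $d$), which is not used in the text; the function names are hypothetical.

```python
import math

def divisors(d):
    """All positive divisors of d in increasing order."""
    return [j for j in range(1, d + 1) if d % j == 0]

def count_solutions(n, d):
    """T_n^{(d)}: number of X in S_n with X^d = e."""
    divs = divisors(d)
    T = [1] * (n + 1)
    for k in range(1, n + 1):
        total = 0
        for j in divs:
            if j > k:
                break
            falling = 1
            for i in range(1, j):  # ways to complete the j-cycle through the point k
                falling *= k - i
            total += falling * T[k - j]
        T[k] = total
    return T[n]

def log_theorem_5_3_1(n, d):
    """log of (n/e)^n n^{-n/d} d^{-1/2} exp(sum_{j|d} n^{j/d}/j [- 1/(2d) for even d])."""
    s = sum(n ** (j / d) / j for j in divisors(d))
    if d % 2 == 0:
        s -= 1.0 / (2 * d)
    return n * (math.log(n) - 1.0) - (n / d) * math.log(n) - 0.5 * math.log(d) + s

for n in (200, 2000):
    print(n, math.log(count_solutions(n, 4)) - log_theorem_5_3_1(n, 4))
```

Setting `d = 2` in `log_theorem_5_3_1` reproduces the main term of Theorem 5.1.3, in agreement with the remark after the statement of the theorem.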


5.4. Notes and references 


The study of equations of the form $X^d = e$ in the symmetric group $S_n$ is directly
related to one of the significant characteristics of the elements of $S_n$: the order
of permutations. By the order $O_n(s)$ of a permutation $s \in S_n$, we mean the least
positive integer $k$ such that $s^k$ is the identity permutation. The orders of elements
in $S_n$ vary from 1 to the maximal value $G(n)$ over all $s \in S_n$. E. Landau [95] shows
that

$$\lim_{n \to \infty} \frac{\log G(n)}{\sqrt{n \log n}} = 1.$$

In spite of such a wide range of $\log O_n(s)$, the typical values of $\log O_n(s)$ are
considerably less than $\log G(n)$ and are concentrated near $2^{-1} \log^2 n$. Let $O_n$ be
the order of a random permutation from $S_n$ with uniform distribution. The following
assertion is well known.


Theorem 5.4.1. For any fixed $x$,

$$\lim_{n \to \infty} P\Bigl\{ (\log O_n - 2^{-1} \log^2 n) \big/ \sqrt{3^{-1} \log^3 n} \le x \Bigr\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\, du.$$

The asymptotic normality of $\log O_n$ was first proved by P. Erdős and P. Turán
[39]. Other proofs of Theorem 5.4.1 can be found in [106, 18, 27]. All the proofs
are rather cumbersome and involve many analytical difficulties. From our point
of view, the simplest proof, but still not a sufficiently simple one, is suggested in
[78], where the approach based on the generalized scheme is used.
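The order of a permutation is the least common multiple of its cycle lengths, so both $O_n(s)$ and, for small $n$, the maximal order $G(n)$ are easy to compute. The sketch below is illustrative only; the function names are hypothetical, and the search over partitions is exponential in $n$.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a // gcd(a, b) * b

def permutation_order(perm):
    """Order of a permutation given as a sequence mapping i -> perm[i]."""
    seen = [False] * len(perm)
    order = 1
    for start in range(len(perm)):
        length, j = 0, start
        while not seen[j]:  # walk along the cycle containing start
            seen[j] = True
            j = perm[j]
            length += 1
        if length > 0:
            order = lcm(order, length)
    return order

def landau_g(n):
    """G(n): maximal order of an element of S_n, by exhaustive search over cycle types."""
    best = 1

    def search(remaining, max_part, current):
        nonlocal best
        best = max(best, current)
        # cycles of length 1 do not change the lcm, so only parts >= 2 are tried
        for part in range(2, min(remaining, max_part) + 1):
            search(remaining - part, part, lcm(current, part))

    search(n, n, 1)
    return best

print(permutation_order([1, 2, 0, 4, 3]))
print([landau_g(n) for n in range(1, 11)])
```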

It seems to us that investigating the numbers of solutions of equations of the
form $X^d = e$ could provide the basis for the study of the local behavior of $O_n$.
Indeed, if $p$ is prime, then $T_n^{(p)}$ counts the identity permutation together with the
permutations $s \in S_n$ whose order $O_n(s) = p$. Since the leading term of the asymptotics
of the number $T_n^{(d)}$ for a compound $d$ is $(n/e)^{n(1 - 1/d)}$, almost all permutations
counted by $T_n^{(d)}$ probably have the order $d$. It would be of considerable interest to
find the asymptotics of the local probabilities $P\{O_n = d\}$ for $d$ that lie in a
neighborhood of $\exp\{2^{-1} \log^2 n\}$ and to see whether the integral limit theorem follows
from these results in spite of the fact that the behavior of the probabilities
$P\{O_n = d\}$ is likely to be rather complicated. By virtue of the irregularity of the
behavior of $P\{O_n = d\}$, this problem is not as trivial as obtaining the integral limit
theorem from the local theorem usually is, because now we have to obtain the local
theorem for $d$ of a specified form and, in addition, we have to know how many $d$ of
such a form exist.

Theorems 5.1.1 and 5.1.2 for R = {1,2} and Theorem 5.1.3 were proved in 
[32]. Theorem 5.1.2 for R = {1, p}, p > 2, was proved in [61], and for an arbitrary 
R in [33]. 

Theorem 5.1.3 was proved in [103], where the result of Theorem 5.2.1 was 
also presented. Assertion (5.2.9) of Theorem 5.2.2 was proved by the saddle-point 
method in [144]. 

Theorem 5.3.1 was proved in [108, 145, 150] independently and almost simul- 
taneously. 

The approach based on the generalized scheme of allocation, presented in Chapter 5
of this book, was first published in [82], where the proof of Theorem 5.1.3 was
realized with the help of this approach. The proof of Theorem 5.3.1 in Section 5.3
follows A. V. Kolchin [68], who, in addition, extended this theorem to the case
$d \to \infty$ such that $d \ln\ln n/\ln n \to 0$.

The general conditions of existence of a solution of the equation $X^d = a$, where
$a$ is a fixed permutation and $X$ is an unknown permutation from $S_n$, are given in
[102].

The system of equations

$$X_1^{m_1} = X_2^{m_2} = \cdots = X_k^{m_k} = e,$$

where $k \ge 2$, $m_1, \ldots, m_k$ are fixed natural numbers, $X_1, \ldots, X_k \in S_n$, and $e$ is the
identity permutation in $S_n$, is considered in [110]. The asymptotic representation
of the number of solutions $X = (X_1, \ldots, X_k)$ such that $X_i X_j = X_j X_i$ for all
$i \ne j$ is found.